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For  almost  forty  years,  the  simplex  method  has  been  the  method  of  choice  for  solving 
linear  programs.  The  method  consists  of  first  finding  a  feasible  solution  to  the  problem 
(Phase  I),  followed  by  finding  the  optimum  (Phase  II).  Many  algorithms  have  been  pro¬ 
posed  which  try  to  combine  the  processes  embedded  in  the  two-pnase  process.  This  study 
will  compare  the  merits  of  some  of  these  composite  algorithms. 

Theoretical  and  computational  aspects  of  the  Weighted  Objective,  Self- Dual  Paramet 
ric,  and  Markowitz  Criteria  algorithms  are  presented.  Different  variants  of  the  Self-Dual 
methods  are  discussed. 

A  large  amount  of  computational  experience  for  each  algorithm  is  presented  These 
results  are  used  to  compare  the  algorithms  in  various  ways.  The  implementations  of  each 
algorithm  are  also  discussed  One  theme  that  is  present  throughout  all  of  the  computational 
experience  is  that  there  is  no  one  algorithm  which  is  the  best  algorithm  for  all  problems 
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Section  1.  Introduction 


Since  the  simplex  method  was  discovered  by  Dantzig  in  1947,  it  has  remained  the 
method  of  choice  for  solving  linear  programs.  It  is  used  every  day  by  people  from  many 
different  fields  and  in  many  different  applications.  While  other  methods  have  been  pro¬ 
posed  in  the  past  for  solving  linear  programs,  none  were  shown  to  perform  consistently 
as  well  as  the  simplex  method  (see,  for  example,  Hoffman,  et.al.,  1953).  Recently,  how¬ 
ever,  new  methods  have  been  proposed  based  on  Karmarkar's  (1984)  discovery  of  a  new 
polynomial- time  algorithm  for  linear  programming,  and  they  are  posing  serious  challenge. 
In  comparing  the  performance  of  these  algorithms  with  the  simplex  method,  there  is  an 
implicit  assumption  that  the  simplex  method  is  an  explicit  algorithm,  whereas,  in  practice, 
there  are  many  variants.  The  purpose  of  this  study  is  to  compare  the  variants  to  see  if  any 
one  variant  can  be  said  to  be  the  best  in  some  sense  and  could  be  used  as  a  standard  for 
comparison  with  non-simplex  algorithms. 

While  many  variants  of  the  simplex  method  have  been  proposed,  systematic  studies 
of  their  effectiveness  have  not  appeared  in  the  scientific  literature.  This  study  is  limited 
to  a  certain  class  of  variants  of  the  simplex  method,  called  composite  simplex  algorithms. 
This  report  contains  computational  comparisons  of  some  composite  simplex  algorithms, 
and  emphasis  is  placed  on  making  the  computational  experiments  replicable  by  other 
researchers. 

Section  2  defines  the  class  of  composite  simplex  algorithms  and  describes  a  generic 
simplex  algorithm  and  a  framework  whereby  all  the  composite  simplex  algorithms  are 
implemented.  Section  3  discusses  the  various  options  that  are  available  when  one  solves  a 
linear  program  on  a  computer.  The  influence  of  these  options  must  be  understood  when 
making  computational  comparisons  of  variants  of  the  simplex  method.  The  results  for  the 
standard  two-phase  algorithm  are  presented  here. 

Section  4  discusses  the  weighted  objective  algorithm.  Section  5  analyzes  the  Markowitz 
Criterion.  In  Section  6,  Dantzig’s  self-dual  parametric  algorithm  is  presented.  Some  vari¬ 
ants  of  the  self-dual  algorithm  are  presentea  in  Section  7.  In  Section  8,  the  computational 
results  are  analyzed  and  suggestions  for  further  research  are  offered. 

The  following  general  inferences  can  be  drawn  from  the  very  extensive  systematic 
trials  made  in  this  study: 

1 .  Almost  every  variant  performed  remarkably  well  on  some  problems  and  very  poorly 

on  others. 

2  No  variant  stood  out  clearly  as  the  best  on  all  problems. 
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Section  2.  The  Linear  Programming  Problem 


Section  2.  The  Linear  Programming  Problem 


In  this  section,  the  general  linear  programming  problem  is  discussed.  A  “generic” 
simplex  method  will  be  described  in  a  way  that  serves  as  a  framework  for  describing  various 
variants.  The  mathematical  programming  system  MINOS  (Murtagh  and  Saunders,  1983), 
suitably  modified,  was  used  in  the  experiments  on  test  examples  in  this  study.  A  description 
of  how  MINOS  solves  linear  programs  is  therefore  given.  Finally,  the  “MinSumlnf”  ratio 
test  is  discussed  as  an  alternative  ratio  test  which  can  be  used  in  some  methods. 

2.1  The  MINOS  Standard  Form 

A  linear  program  can  be  described  as  the  minimization  or  maximization  of  a  linear  form 
z  =  cTx  over  a  linear  constraint  set.  Each  of  the  constraints  in  this  set  may  be  an  equality 
or  an  inequality  constraint.  After  a  linear  program  is  formulated  mathematically  and 
numerical  values  assigned  to  its  coefficients  and  right-hand  side,  these  values  are  placed  in 
an  input  file  in  a  prescribed  format  in  order  for  a  computer  to  solve  it.  Once  the  data  is 
read  into  the  computer,  the  next  step  is  to  translate  it  into  an  equivalent  form  for  internal 
use  by  the  computer  program.  An  algorithm  is  then  used  to  solve  the  linear  program.  In 
the  case  of  MINOS,  the  form 


minimize  cTx 

subject  to  [A  I\x  =  0,  (2.1) 

l  <  x  <  u. 

is  termed  the  The  MINOS  Standard  Form. 

To  maximize  z,  the  user  can  minimize  —  z  and  obtain  the  same  optimal  solution  z. 
Henceforth,  it  is  not  restrictive  to  assume  that  the  objective  function  is  being  minimized. 
The  number  of  variables  is  denoted  by  n,  and  the  solution  is  then  given  by  some  z  6  3?n. 
Each  variable  x},  j  =  1,  . . . ,  n  is  termed  a  decision  variable  and  can  have  a  lower  bound 
l}  and  an  upper  bound  Uj.  Some  of  these  bounds  may  be  — oo  and/or  +00,  respectively. 

The  number  of  constraints  in  the  problem  is  denoted  by  m.  Each  constraint  is  de¬ 
scribed  by  a  vector  a*  €  Rn  and  two  scalars,  and  r<.  In  the  MINOS  manual  (Murtagh 
and  Saunders,  1983),  b i  is  termed  the  right-hand  side  and  r*  is  termed  the  range.  The  tth 
constraint  is  then 

6,  +  r*  <  afx  <  bi  if  rj  <  0, 

bi  <  afx  <  bi  +  if  r<  >  0. 

Note  that  if  r,  =  0,  the  constraint  is  of  the  form  afx  =  bi\  if  r,  =  +  oo,  the  constraint  is 
of  the  form  afx  >  bi;  if  r,  =  — oo,  the  constraint  is  of  the  form  afx  <  bi.  Given  these  m 
constraints,  m  slack  variables,  Xj,  j  =  n  +  1,  . . . ,  n  +  m,  can  be  added  to  transform  each 
of  the  constraints  into  equality  constraints.  This  is  done  by  setting  zn+,  =  -afx,  i  =  1, 
. . . ,  m,  and  placing  lower  and  upper  bounds  on  zn+,  that  were  previously  on  afx.  Hence 
the  linear  program  has  been  converted  to  the  form  (2.1).  Now  z  G  3?n+m  and  there  is  no 
need  to  treat  differently  the  slack  variables  and  the  decision  variables  of  the  problem,  since 
each  x}  is  considered  as  a  column  of  the  large  matrix  [A  I )  with  a  cost  coefficient  c}  and 
lower  and  upper  bounds  and  u;. 
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2.2  A  Generic  Simplex  Method 

The  simplex  algorithm  as  described  by  Dantzig  (1963)  (and  in  earlier  references)  is  ap¬ 
plicable  to  linear  programs  that  have  an  initial  primal  feasible  basis.  In  order  to  find  an 
initial  basis,  a  “Phase  I”  procedure  was  devised.  This  procedure  sets  up  a  linear  program 
which  could  be  solved  by  the  simplex  algorithm,  and  which  yields  a  primal  feasible  basis 
or  a  verification  that  the  problem  is  infeasible.  If  the  problem  was  feasible,  the  simplex 
algorithm  could  be  used  again  to  find  an  optimal  solution  (“Phase  II”).  The  combination  of 
first  finding  a  primal  feasible  basis  by  the  simplex  algorithm  and  then  finding  the  optimal 
solution  by  the  same  algorithm  is  collectively  known  as  the  simplex  method.  This  fine 
distinction  is  made  here  to  clarify  the  differences  between  the  simplex  algorithm  and  the 
simplex  method. 

Since  1947  when  Dantzig  proposed  the  simplex  algorithm,  many  variants  of  the  method 
and  algorithm  have  been  proposed.  Some  of  these  variants  are  listed  in  Dantzig’s  book 
(1963).  These  variants  have  a  common  framework,  which  can  be  described  by  the  following 
proto-  algori  t  hm : 

Comment  Let  B  be  a  basis  for  [A  /]. 

while  not  optimal(B)  do 
begin 

fs,g}  *—  indices  of  {incoming  -column ,  exiting  -column); 

Comment  Column  q  of  [A  J]  is  the  same  as  column  r  of  B. 

pivot(B,s,q,r); 

end 

The  boolean  function  optimal(B)  returns  true  if  B  is  both  a  primal  and  dual  feasible 
basis.  The  procedure  pivot(B,s,q,r)  assumes  that  xq  is  the  rth  basic  variable.  Column 
s  of  [A  I]  replaces  the  rth  column  in  the  basis,  which  is  the  same  as  column  q  of  [ A  I). 
This  procedure  usually  updates  a  factorization  of  the  basis  B.  The  order  of  execution 
of  the  functions  incoming -column  and  exiting -column  determines  which  simplex  variant 
is  being  executed.  In  the  case  of  the  primal  simplex  algorithm,  the  steps  (called  pricing 
out)  for  deciding  which  column  is  to  be  the  incoming-column  are  executed  before  the  steps 
(a  primal  ratio  test)  for  deciding  which  is  to  be  the  exiting -column.  In  the  case  of  the 
dual  simplex  algorithm,  finding  the  exiting -column  (determining  the  basic  variable  with 
the  most  infeasibility)  is  done  before  finding  the  incoming -column  (a  dual  ratio  test).  As 
will  be  seen  later,  some  algorithms  may  change  the  order  of  the  two  processes  depending 
on  the  status  of  the  current  iteration.  Furthermore,  the  actual  algorithms  that  define  the 
functions  incoming -column  and  exiting -column  may  change  as  iterations  progress. 

2.3  An  Outline  of  MINOS 

In  order  to  understand  the  implementations  of  the  various  algorithms  studied  in  this  study, 
it  is  necessary  to  understand  the  subroutines  that  MINOS  uses  to  solve  linear  programs. 
For  ease  of  discussion,  the  names  that  were  given  to  subroutines  in  MINOS  have  been 
changed  to  a  mnemonic  that  reminds  the  reader  of  the  subroutine’s  purpose.  After  read¬ 
ing  in  the  problem  specification  and  converting  the  problem  to  MINOS  standard  form 
(subroutine  input),  a  starting  basis  is  determined.  A  basis  is  easily  obtained  from  the 
matrix  (.4  /]  by  choosing  the  slack  variables.  As  discussed  in  Section  3,  it  is  usually  more 
efficient  to  determine  an  initial  “crash”  basis  to  begin  the  simplex  method  (subroutine 
crash).  Each  nonbasic  variable  is  initially  set  to  be  equal  to  one  of  its  bounds.  B  denotes 
the  index  set  . . .  ,jm)  of  the  basis.  The  order  of  the  basic  columns  is  important  and 

is  indicated  by  an  integer  vector  Bt,  i  =  1,  ...,  m  where  B,  =  j  indicates  that  x}  is  the 
fth  basic  variable.  Given  the  initial  basis  and  the  initial  states  of  the  nonbasic  variables, 
MINOS  executes  the  following  loop  until  termination: 
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proced  ure  pri maLsimplex ; 
finished  4—  false; 
while  not  finished  do 
begin 

if  needs  jactorization(B)  then  factorize(B)’, 
if  infeasible  Jbasis(B)  then  c  *—  d  else  c  *—  c; 

FindPI:  Solve  wTB  =  c,  for  n; 


jPriceOut:  Find  s  such  that  x,  has  minimum  reduced  cost  c4; 
optimal  *—  c$  >  0; 

if  not  optimal  then _ 

(Represent:  Solve  Bp  =  [A  /].,  for  p; 


RatioTest:  Find  r  such  that  xBr  blocks  the  change  in  x, ; 

unbounded  *—  r  =  0 
if  not  unbounded  then 

begin 

<7  <-  Br  ; _ 

Update:  Update  value  of  x; 

pivot{B,s,q,r ); 

end; 

end; 

infeasible  infeasible Jbasis(B)  and  optimal ; 
finished  4—  optimal  or  unbounded  or  infeasible’, 
end; 

The  boolean  function  needs -factorization(B)  is  true  if  the  basis  needs  to  be  refac¬ 
torized.  A  number  of  circumstances  can  contribute  to  this  condition.  MINOS  has  a 
FACTORIZE  FREQUEHCY  option,  which  determines  the  maximum  number  of  updates  to  B 
that  can  be  done  before  refactorizing  the  basis.  This  number  defaults  to  50.  MINOS  will 
also  refactorize  the  basis  if  the  size  of  the  current  factorization  has  increased  significantly, 
or  if  there  is  insufficient  storage  to  update  the  factors.  Because  of  the  modular  nature  of 
the  MINOS  code,  the  conditions  influencing  refactorization  are  external  to  the  variants  of 
the  simplex  method  that  are  being  considered.  Furthermore,  the  subroutine  factorize(B) 
and  the  routines  that  solve  the  equations  ir TB  =  c  and  Bx  =  b  can  be  treated  as  a  pack¬ 
age;  the  results  of  using  this  package  do  not  influence  the  paths  generated  by  the  different 
variants  of  the  simplex  method.  MINOS  uses  a  sparse  LU  factorization  of  the  basis.  The 
implementation  is  sufficiently  modularized  that  the  factorization  and  solve  routines  can  be 
replaced  by  some  other  equivalent  scheme  of  maintaining  the  factorization  of  the  basis,  if 
so  desired. 

The  boolean  function  infcasible-basis(B)  is  true  if  the  current  basic  solution  indicated 
by  B  is  primal  infeasible.  The  vector  d  6  Rn+m  is  defined  element  by  element  as 

(  1  if  X j  >  Uj’ 

di  =  \  -1  if  <  hi 

V  0  otherwise 

and  represents  the  Phase  I  cost  form  used  when  the  problem  is  infeasible.  Note  that  if  x} 
is  nonbasic,  then  dj  =  0,  since  each  nonbasic  variable  is  always  equal  to  one  of  its  bounds. 
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The  Phase  I  objective  vector  d  changes  on  each  iteration,  depending  on  which  variables  are 
infeasible.  Once  a  variable  is  feasible,  the  ratio  test  prevents  the  variable  from  becoming 
infeasible.  The  objective  in  Phase  I  never  increases  since  MINOS  chooses  incoming  vari¬ 
ables  based  on  the  cost  form  w  =  d?x.  However,  because  the  sum  of  infeasibility  changes  as 
a  function  of  the  incoming  variable,  it  is  possible  for  this  sum  to  increase.  MINOS  always 
insures  that  if  the  sum  does  increase,  the  number  of  infeasibilities  will  decrease,  which  is 
enough  to  prove  convergence. 

After  determining  the  objective  to  minimize  and  solving  for  prices  7r,  MINOS  chooses 
an  incoming  variable  x,  by  computing  the  reduced  costs  of  the  nonbasic  variables  and 
picking  the  variable  with  the  minimum  reduced  cost.  The  reduced  cost  Cj  is  defined  for  x} 
nonbasic  as 


F.  _  l6j  -*T[A  I]j  if  Xj  =  lj; 

1  l  -(cj  ~  I].j)  if  Xj  -  Uj. 


1  l  -(%  ~  *T[A  I).j)  if  Xj  -  Uj. 

It  should  be  noted  that  MINOS  does  not  save  the  values  of  the  reduced  costs  as  they  are 
computed,  since  these  values  are  only  needed  to  determine  the  index  s  of  the  incoming 
variable.  This  facilitates  the  use  of  partial  pricing  schemes,  which  are  discussed  in  Section 
3. 

The  next  step  in  the  simplex  method  is  a  ratio  test.  In  order  to  do  this,  a  direction 
vector  for  modifying  the  current  solution  x  is  determined  by  solving  the  equation  Bp  = 
[A  The  vector  p  is  the  representation  of  the  incoming  column  in  terms  of  the  basis  B. 
The  components  of  x  except  for  x8  and  xt  are  not  changed.  The  dimension  of  the  vector 
of  basic  variables  x8  is  expanded  by  one  to  include  xs,  so  that  x  =  (xB,x,).  The  direction 
of  change  of  x  is  p  =  (p,  t),  where  t  =  1  if  x,  =  u,  and  t  =  —  1  if  x,  =  To  make  the 
ratio  test  easier  to  implement,  let  Bm+i  —  s.  The  ratio  test  determines  the  maximum  0 
such  that  x  —  Op  is  feasible.  Specifically, 


0  =  min 


if  p,  >  0; 


t=l,...,m+l  Ug.-X, 


if  p,  <  0; 


Given  6 ,  the  value  of  the  basic  variables  can  be  changed  using  the  formula  xe  <—  x„  —  0p. 
If  xr  is  the  blocking  variable,  then  Br  and  0m+i  are  interchanged  as  well  as  xr  and  xBrn+x. 
If  r  —  m  +  1,  then  the  variable  x,  has  moved  from  one  of  its  bounds  to  the  opposite  bound, 
and  no  update  of  the  basis  is  necessary. 
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2.4  The  “MinSumlnf”  Ratio  Test 

The  standard  ratio  test  described  above  has  computational  complexity  O(m).  As  an 
alternative  to  this  ratio  test  in  Phase  I,  Greenberg  (1978)  described  a  primal  ratio  test 
which  has  come  to  be  known  as  the  “Dennis  Rarick”  ratio  test,  since  he  implemented  it  in 
WHIZARD  (see  MPS-III  (1975)).  Wolfe  (1961)  described  this  ratio  test  as  an  “extended 
composite  algorithm,”  but  his  use  of  the  word  “composite”  has  a  different  meaning  than 
that  defined  in  the  next  section.  Because  of  the  lack  of  suitable  references,  it  is  not  clear 
from  the  literature  whether  Rarick  or  Greenberg  were  aware  of  Wolfe’s  earlier  work. 

It  is  easy  to  prove  that  the  function  measuring  the  sum  of  the  infeasibilities  of  the 
current  basic  variables  over  the  range  of  the  incoming  basic  variable  x,  is  convex.  The  key 
idea  behind  the  ratio  test  is  to  choose  the  outgoing  variable  associated  with  the  value  of  xa 
that  minimizes  this  function.  Without  loss  of  generality,  we  may  assume  (by  translating 
the  variables  and  reversing  sign,  if  necessary)  that  all  the  variables  have  lower  bounds  /  =  0 
and  upper  bounds  u  =  +oo.  Unrestricted  variables  with  infinite  lower  and  upper  bounds 
are  ignored.  After  selecting  the  incoming  variable  x,  and  finding  the  representation  p  of 
the  incoming  column,  the  value  of  xt  is  increased  from  0  to  some  value  0  while  the  values  of 
the  basic  variables  xB. ,  i  =  1,  . . . ,  m  are  changed  according  to  the  formula  xB.  *—  xB.  —  Op. 

To  simplify  the  notation,  let  /?,  =  xBi,  and  a,  =  p,.  For  9  >  0,  the  function  which 
computes  the  sum  of  the  infeasibility  of  the  basic  variables  can  be  written  as 


m 

S(9)  =  ~  5^  min(0,  fit  -  8oti). 

1=1 

As  a  function  of  8 ,  this  function  is  piecewise  linear  and  convex.  An  example  is  shown 
in  Figure  2.1.  The  function  has  breakpoints  ti  =  ft /a*  where  the  slope  Wi(9)  =  5,'(0) 
changes.  The  “MinSumlnf”  ratio  test  searches  for  the  ratio  ti  that  minimizes  S{8).  Let 
w~  be  the  left  derivative  and  be  the  right  derivative  at  ti.  Note  that  w+  =  tu~+1  for 
t  =  1, . ..  ,m  —  1.  Since  S(9)  is  convex,  w~  <  w+.  The  minimum  of  the  function  occurs  at 
9  =  ti  where  <  0  and  w+  >  0.  If  both  ti  and  tj+i  tie  for  the  minimum,  t,  is  selected. 

Greenberg’s  algorithm  to  minimize  S(9)  involves  sorting  the  sequence  t{.  Under  a  sim¬ 
ple  computational  model,  it  can  be  shown  tnat  at  least  0(n  log  n)  comparisons  are  needed 
to  sort  this  sequence  (Aho,  Hopcroft,  and  Ullman,  1974).  However,  the  methodology  of 
Megiddo  (1983),  who  found  a  linear-time  algorithm  to  solve  linear  programs  when  n  =  2, 
can  be  used  in  the  “MinSumlnf”  ratio  test  in  order  to  develop  a  linear-time  algorithm  to 
find  the  t,  that  minimizes  S(9).  The  latter,  however  is  difficult  to  implement,  and  is  only 
applicable  to  a  two-phase  method.  For  these  reasons,  this  ratio  test  was  not  considered  in 
this  study. 
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infeasibility 
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Figure  2.1  The  function  S(0) 


2.5  Composite  Simplex  Algorithms 

In  most  mathematical  programming  systems  today,  the  simplex  method  is  used  with  the 
Phase  I/Phase  II  procedure  described  earlier.  Ordinarily,  the  Phase  I  linear  program  has 
multiple  optimal  solutions,  since  every  feasible  solution  for  the  original  linear  program  is 
optimal  in  Phase  I.  This  fact  would  lead  one  to  believe  that  the  Phase  II  procedure  could 
begin  at  a  vertex  very  far  away  from  the  optimal  vertex.  This  was  observed  early  on  by 
many  who  worked  on  the  development  of  the  simplex  method;  these  investigators  invented 
numerous  composite  simplex  algorithms  which  combine  in  various  ways  the  reduction  of 
primal  and  dual  infeasibility,  until  zero  is  achieved  for  both.  Essentially,  these  composite 
algorithms  try  to  use  information  about  the  objective  function  cTx  while  finding  an  initial 
feasible  point. 

Finding  an  optimal  solution  of  the  primal  linear  program  involves  among  other  things 
finding  a  feasible  solution  to  the  dual  linear  program.  As  most  of  the  methods  progress,  the 
primal  and  dual  systems  play  a  game  of  “Tug-of-War.”  At  any  primal  infeasible  point,  it  is 
impossible  to  know  whether  the  current  objective  value  is  below  (superoptimal)  or  above 
(suboptimal)  its  optimal  value.  If  the  point  is  suboptimal,  an  improvement  can  often  be 
made  which  will  simultaneously  reduce  the  amount  of  infeasibility  and  decrease  the  value 
of  an  objective  function  (perhaps  going  below  the  optimal  value  of  the  objective).  At  a 
superoptimal  point,  a  reduction  in  ,v:e  amount  of  infeasibility  often  can  only  be  made  by 
increasing  the  value  of  the  objecti\<  function.  However,  some  superoptimal  points  have 
adjacent  vertices  which  can  reduce  infeasibility  and  improve  the  value  of  the  objective 
function.  At  such  a  point,  movement  to  such  an  adjacent  vertex  looks  promising,  but  one 
is  deceived  by  the  local  improvement  criterion  of  the  simplex  method.  In  this  case,  the 
adjacent  vertex  is  improving  the  value  of  the  primal  objective  function  while  increasing 
the  amount  of  dual  infeasibility.  It  is  in  this  sense  that  we  say  that  the  primal  and 
dual  feasibility  conditions  are  “tugging”  at  each  other.  The  various  composite  algorithms 
attempt  to  act  as  the  referee  between  trying  to  satisfy  the  competing  primal  and  dual 
feasibility  conditions. 
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In  the  course  of  making  computational  comparisons  of  various  variants  of  the  simplex  ! 

method,  it  is  important  to  recognize  that  many  factors  can  contribute  to  the  overall  perfor-  ' 

mance.  MINOS  allows  the  user  to  select  from  a  number  of  options,  and  the  ones  used  must  j 

be  carefully  chosen.  In  this  section,  these  options  and  their  effects  on  the  computational  ] 

testing  are  discussed.  In  order  to  reduce  the  amount  of  variability  in  performance  due  I 

to  the  options  chosen,  the  options  used  were  fixed  in  such  a  way  as  not  to  influence  the 
performance  of  any  specific  algorithm  at  the  expense  of  another  competing  algorithm.  In 
this  section,  previous  studies  on  computational  testing  of  the  simplex  algorithm  are  also 
discussed. 

3.1  Practical  Experience  with  the  Simplex  Method 

The  question  of  just  how  efficient  the  simplex  method  is  has  existed  since  its  proposal 
in  1947.  Interestingly  enough,  in  spite  of  the  long  interest  in  its  efficiency,  the  amount 
of  serious  scientific  testing  of  the  algorithm  appears  to  be  quite  small.  Shamir  (1986) 
provides  an  excellent  summary  of  results  that  have  appeared  in  the  literature  pertaining 
to  the  performance  of  the  simplex  method;  the  interested  reader  should  refer  to  his  work 
for  references.  Shamir  states: 

“The  experience  accumulated  during  the  last  three  decades  on  the  behavior  of 
the  Simplex  Method  is  undoubtedly  vast.  However,  there  is  surprisingly  little 
documented  evidence  on  this  experience  in  the  scientific  literature.  One  prosaic 
explanation  may  be  that  practitioners  usually  do  not  keep  record  of  parameters  of 
the  solution  process  for  the  problems  they  solve  (since  they  are  interested  mainly 
in  the  result),  and  if  they  do  they  seldom  report  on  this  record  in  the  scientific 
literature.” 

With  regard  to  many  composite  simplex  algorithms  that  have  been  proposed,  no  systematic 
study  seems  to  exist  in  the  scientific  literature  of  their  efficiencies. 

The  earliest  experiments  with  the  simplex  method  were  performed  by  Hoffman,  Man- 
nos,  Sokolowsky,  and  Wigmann  in  1953.  At  the  time,  they  were  comparing  the  simplex 
method  to  other  non-simplex  methods  for  solving  linear  programs;  they  found  the  simplex 
method  to  be  superior.  Dantzig  has  remarked  that  their  work  provided  the  impetus  to 
perfect  the  simplex  method.  If  their  results  had  not  been  favorable,  it  is  possible  that 
other  methods  would  have  been  perfected  and  the  simplex  method  would  not  be  in  use 
today! 

In  1963,  Wolfe  and  Cutler  experimented  with  nine  “real”  linear  programming  prob¬ 
lems,  which  were  submitted  to  them  by  practitioners  of  linear  programming  at  the  time. 

They  compared  different  pivoting  rules  for  the  simplex  method;  their  conclusion  was  that 
the  original  ‘most  negative  reduced  cost’  rule,  in  spite  of  the  fact  that  it  is  dependent 
on  the  scaling  of  the  variable,  did  as  well  as  the  other  alternatives.  It  may  be  one  of 
the  reasons  why  it  is  used  in  most  codes  today.  The  most-negative  reduced  cost  rule  has 
its  origins  in  problems  with  a  convexity  constraint.  In  such  problems,  an  argument  can 
be  made  for  choosing  this  rule.  Why  it  appears  to  work  well  in  practice  without  such 
a  constraint  is  a  mystery  and  may  be  due  to  practitioners  choosing  units  for  competing 
activities  consistently  (See  Dantzig  (1987)  for  further  discussion).  Since  the  Wolfe-Cutler 
study,  no  serious  computational  comparisons  of  simplex  variants  on  resd-world  problems 
have  appeared  in  the  scientific  literature. 

In  contrast  to  using  “real”  problems  for  testing,  there  has  been  much  work  done 
using  randomly  generated  problems.  In  the  early  sixties,  Kuhn  and  Quandt  (1962)  were 
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the  first  to  conduct  a  series  of  experiments  comparing  the  performance  of  many  different 
pivoting  rules  on  randomly  generated  problems.  Other  authors  followed  with  different 
experiments,  usually  changing  some  aspect  of  the  probabilistic  model.  Because  of  the 
stochastic  nature  of  these  experiments,  the  statistics  computed  can  give  some  insight  on 
the  average  performance  of  different  variants.  But,  as  discussed  later,  it  is  not  clear  that  the 
performance  of  the  simplex  method  on  such  problems  is  representative  of  the  performance 
of  the  simplex  method  on  the  variety  of  real-world  problems  that  are  solved  every  day  by 
linear  programming  practitioners. 

3.2  The  Crash  Basis 

Like  most  linear  programming  systems,  MINOS  provides  the  option  of  finding  a  crash  basis 
as  the  initial  basis  to  start  up  the  simplex  method.  The  key  idea  is  to  rearrange  the  order 
of  the  rows  and  then  select  a  lower  triangular  basis  using  columns  from  the  original  input 
matrix  A.  It  should  be  noted  that  columns  from  the  slack  variables  can  always  be  used 
to  keep  the  initial  basis  lower  triangular.  In  fact,  if  A  were  totally  dense,  then  only  one 
column  would  be  chosen  from  A ,  with  the  rest  of  the  columns  being  chosen  from  the  slack 
variable  matrix  I.  MINOS  also  selects  columns  so  that  each  diagonal  element  of  the  crash 
basis  is  reasonably  large  relative  to  the  other  elements  in  the  same  column,  in  order  to 
prevent  the  initial  basis  from  being  ill-conditioned. 

There  seem  to  be  some  good  reasons  to  begin  with  a  crash  basis,  as  opposed  to  the 
basis  consisting  entirely  of  slack  variables.  To  understand  these  reasons,  the  concept  of 
Hamming  distance  is  useful.  Let  J\  and  J2  be  two  sets  of  indices  of  basic  variables.  Then 
the  Hamming  distance  is  defined  as 

h{Ji,J2)  =  \Ji  uJ2|  -  \J\  n  J2\. 

In  other  words,  h{  Jx,  J2)  is  the  minimal  number  of  pivot  steps  (ignoring  feasibility)  required 
to  move  from  the  basis  corresponding  to  Jj  to  the  basis  corresponding  to  J2.  It  provides 
a  useful  lower  bound  on  how  close  a  specific  basis  is  to  another  basis,  in  terms  of  number 
of  pivots  of  the  simplex  method. 

In  most  linear  programs,  the  number  n  of  original  columns  is  larger  than  the  number 
m  of  rows.  Hence,  if  all  variables  (decision  and  slack)  have  equal  probability  of  being  in 
a  specific  basis  B,  the  probability  of  the  eth  basic  variable  in  the  basis  B  being  a  decision 
variable  (as  opposed  to  a  slack  variable)  is  Hence,  the  expected  number  of 

decision  variables  in  the  optimal  basis  should  be  larger  than  the  expected  number  of  slack 
variables  in  the  optimal  basis.  Therefore,  choosing  a  crash  basis  from  the  set  of  original 
decision  variables  should  keep  the  initial  Hamming  distance  between  the  crash  basis  and 
the  optimal  basis  lower. 

On  the  other  hand,  if  we  consider  the  probability  of  a  certain  variable  x}  being  in 
the  optimal  basis,  with  all  bases  being  equally  likely,  a  different  analysis  is  required.  Since 
there  are  (n^m)  possible  bases,  and  (n+™-1)  bases  which  do  not  include  x}.  the  probability 
of  x}  not  being  in  a  random  basis  is  Hence,  the  likelihood  of  x}  being  in  a  randomly 

chosen  basis  is  — l—  <  j.  So  choosing  a  specific  x}  to  be  in  the  crash  basis  may  be 
disadvantageous. 

However,  many  slack  variables  correspond  to  equality  rows,  and  therefore  cannot 
be  in  the  optimal  basis  unless  redundancies  among  the  constraints  exist,  or  in  certain 
degenerate  situations.  Furthermore,  in  formulating  linear  programs,  there  is  a  tendency 
to  write  inequality  constraints,  a  good  proportion  of  which  are  expected  to  be  tight  at 
the  optimal  solution  so  that  their  corresponding  slack  variable  will  most  likely  not  be  in 
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the  optimal  basis.  On  the  other  hand,  if  the  problem  has  many  constraints  which  are  not 
binding  in  the  optimal  solution,  then  the  optimal  basis  may  contain  more  slack  variables 
than  decision  variables.  In  most  problems,  the  former  seems  to  be  the  case.  With  these 
considerations,  all  the  experiments  in  this  dissertation  were  done  using  the  CRASH  BASIS 
option  of  MINOS  to  initiate  the  various  algorithms. 

3.3  Scaling 

MINOS  offers  a  SCALE  option  when  solving  a  linear  program.  This  option  will  take  the 
original  input  of  the  problem  and  attempt  to  make  the  coefficients  of  each  column  and  row 
as  close  to  1  as  possible.  This  is  done  by  choosing  two  nonsingular  diagonal  matrices  R  and 
5  to  respectively  multiply  the  rows  and  columns  of  the  problem.  The  values  of  the  elements 
of  A ,  e,  /,  and  u  are  modified.  The  algorithm  used  in  MINOS  is  described  by  Fourer  (1982). 
The  most  immediate  advantage  of  using  the  SCALE  option  is  in  controlling  any  numerical 
difficulties  that  may  occur  in  a  specific  problem.  Scaling  is  theoretically  attractive  because 
it  makes  certain  selection  rules  scale  free.  Appealing  as  it  is  on  theoretical  grounds  to  be 
scale  free,  nevertheless,  it  is  not  clear  how  scaling  affects  the  performance  of  the  simplex 
algorithm  or  its  variants.  This  is  apparent  from  the  computational  results  in  this  study. 

Badly  scaled  linear  programs  seem  to  be  an  evil  that  those  who  develop  software  to 
solve  practical  problems  must  contend  with.  Quite  often  practical  models  contain  con¬ 
straints  that  implicitly  must  convert  the  units  used  in  one  constraint  to  the  units  used 
in  another.  For  example,  if  Xi  is  a  variable  whose  unit  is  millions  of  dollars,  and  x2  is  a 
variable  whose  unit  is  dollars,  then  a  constraint  that  adds  x,  and  x2  would  be  similar  to 

x,  +  10*x2  <  10. 

It  is  clear  that  such  constraints  can  cause  bad  problems  with  regard  to  numerical  accuracy. 
Scaling  often  helps  remove  some  of  these  effects. 

It  is  easy  to  see  that  scaling  a  problem  can  change  the  path  of  the  simplex  algorithm.  If 
a  column  j  is  multiplied  by  p ,  then  the  reduced  cost  of  that  column  will  be  pcy  Therefore, 
the  choice  of  incoming  column  used  by  the  'most  negative  reduced  cost  rule'  could  change 
if  the  units  of  a  variable  were  changed  by  the  user  at  the  time  of  formulating  his  linear 
program. 

The  scaling  algorithm  is  executed  before  a  crash  basis  is  determined.  Scaling,  of 
course,  does  not  affect  the  sparsity  pattern  of  a  linear  program.  Because  the  numerical 
values  in  the  constraint  matrix  A  influence  the  choice  of  columns  for  the  crash  basis,  scaling 
can  change  the  starting  basis  and  therefore  comparing  the  results  of  solving  with  the  SCALE 
YES  option  with  the  results  of  solving  with  the  SCALE  10  option  will  not  isolate  the  effects 
of  scaling  alone.  If  the  two  crash  bases  generated  are  labeled  as  the  “unsealed  basis"  and 
the  “scaled  basis,”  these  bases  can  be  used  as  either  the  starting  basis  for  either  the  scaled 
or  unsealed,  problem,  respectively.  Hence,  for  a  specific  problem  and  a  specific  algorithm, 
four  variations  can  be  considered,  and  the  experience  gained  from  running  each  algorithm 
under  all  of  the  variations  can  be  used  to  compare  the  effects  of  scaling  by  removing  any 
variability  in  the  different  starting  bases  For  notational  convenience,  the  unsealed  crash 
basis  will  be  notated  as  UB  and  the  scaled  crash  basis  will  be  notated  as  SB  In  all 
problems  except  AFIRO,  these  bases  were  different 

In  order  to  generate  the  scaled  basis  for  the  unsealed  problem  and  the  unsealed  basis  for 
the  scaled  problem,  the  two  initial  bases  for  each  problem  were  saved  using  the  PUNCH  option 
of  MINOS.  When  each  problem  was  solved,  the  scale  option  was  set  m  the  specifications 
file,  and  the  appropriate  crash  basis  was  read  in  using  the  INSERT  option  When  the  I'B 
basis  was  used  with  the  unsealed  problem,  and  the  SB  basis  with  the  scaled  problem. 
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the  bases  were  not  read  in  from  their  files,  but  were  generated  by  the  code  at  runtime. 
As  explained  later,  the  extra  CPU  time  for  these  extra  computations  was  recorded  and 
adjustments  made  to  correct  the  running  time  statistics  generated  by  the  experiments. 

3.4  Tolerances 

If  a  linear  program  is  solved  using  rational  arithmetic,  the  solution  will  satisfy  the  equations 
[A  I]x  =  0  and  /  <  x  <  u  exactly.  Furthermore,  the  conditions  for  optimality  of  the 
□onbasic  variables  c}  >  0  will  also  hold.  However,  MINOS  uses  floating-point  arithmetic, 
and  it  is  not  practical  to  obtain  exact  solutions  to  the  primal  and  dual  systems.  It  is 
necessary,  therefore,  to  specify  tolerances  for  infeasibility  of  the  problem. 

MINOS  provides  a  FEASIBILITY  TOLERAICE  option.  If  tf  is  this  tolerance,  then  a 
solution  x  is  said  to  be  feasible  if 

l)  —  tf  <  x}  <  u,  +  tf 

for  each  j  =  1,  . . . ,  m  +  n.  Since  these  equations  include  the  slack  variables,  the  linear 
constraints  are  also  satisfied  to  this  tolerance  tf.  For  MINOS,  the  default  feasibility  to! 
erance  is  tf  =  10~*.  For  all  of  the  experiments  done,  the  default  feasibility  tolerance  was 
used. 

A  second  tolerance  is  the  OPTIMALITY  TOLERAICE.  This  is  related  to  the  dual  feasibility 
of  the  problem.  If  tj  is  this  tolerance,  then  the  solution  is  declared  optimal  if 


(To  av<  J  division  by  zero,  the  value  in  the  denominator  is  actually  max(  1 .  )  Dividing 

by  th-  lorm  of  ir  makes  the  test  independent  of  any  scaling  of  the  objective  function  The 
defa'  value  of  the  optimality  tolerance  is  =  10~®  This  value  was  used  in  all  the 
computational  tests. 

The  third  tolerance  that  is  important  when  solving  linear  programs  is  the  PIVOT 
TOLERAICE.  This  tolerance  is  used  to  prevent  columns  from  entering  the  basis  if  the\ 
would  cause  the  basis  to  become  almost  singular  If  this  tolerance  is  denoted  t,,.  then  a 
row  i  in  the  ratio  test  will  be  rejected  if 


!/V  S  (p 

I  he  default  pivot  tolerance  for  MINOS  is  tp  —  fJ/1  and  was  used  throughout  all  the  testing 
Here  f  is  the  measure  of  machine  precision;  fr  was  approximated  111  11  on  the  ma<  hum 
used  m  the  experiments 

In  order  to  maintain  the  “equality"  of  the  linear  constraints  .Air-  (I.  a  numern  al 
test  is  done  on  every  lr,h  iteration,  where  it  i»  the  CHECK  FREQUENCY  If  the  largest  com 
ponent  of  the  residual  vector  r  =  j.4  I  i  is  determined  to  be  too  large  ias  determined  !>\ 
tj  and  |lj|j.  a  larger  error  is  allowed  if  j) j* ji  is  larger  then  the  basis  is  refart onzed  am!  the 
basu  variables  are  recomputed  by  a  spec  lal  a  Igor  it  bin  to  sat  isf\  the  linear  constraints  more 
accurately  I  he  default  value  of  k  -  111  was  used  m  ali  test-  in  this  dissertation  In  each 
test  the  check  never  caused  a  refac  tori/at  loti 
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3.5  Pricing 

In  determining  a  variable  i,  to  enter  the  basis,  the  simplex  algorithm  chooses  a  such  that 
ct  is  minimal.  Since  c  is  not  needed  elsewhere  in  the  execution  of  the  algorithm,  it  is 
efficient  to  compute  c,  for  each  nonbasic  x.  for  j  =  1,  . . . ,  n  +  m,  while  keeping  track 
of  the  minimum  c}  found  so  far.  This  combined  procedure  is  known  as  pricing  out  the 
nonbasic  variables.  The  fundamental  calculation  is  to  compute 

9,  =  Cj  -  wt[A  I]., 

for  each  column.  If  x}  =  l},  then  c.  =  g} ;  otherwise,  x}  =  u}  and  c}  =  —  g}.  If  c}  >  0  for 
all  j,  then  the  current  solution  to  tne  linear  program  is  optimal. 

The  expense  of  pricing  a  specific  column  is  directly  related  to  the  density  of  that 
column.  If  Xj  is  a  slack  variable,  then  g}  =  — If  the  jtk  column  in  A  has  k}  nonzero 
elements,  then  only  k}  multiplications  are  needed  to  compute  g}.  To  actually  compute  .<* 
such  that  c,  is  minimal  requires  one  to  price  out  a 11  the  columns  of  A.  For  convergence  of 
the  simplex  algorithm,  any  a  will  do  provided  c,  <  0  on  each  iteration.  It  has  been  observed 
empirically  that  sweeping  through  the  j  =  (1, .  .  .  ,n)  in  batches  and  optimizing  over  each 
batch  can  reduce  CPU  time  dramatically  in  many  applications.  Hence,  this  scheme,  called 
partial  pricing,  is  used  in  most  commercial  linear  programming  systems.  In  particular,  the 
PARTIAL  PRICE  option  of  MINOS  allows  the  user  to  specify  a  value  p  so  that  the  m  -f  n 
columns  for  the  decision  and  slack  variables  are  divided  equally  into  p  batches,  Ak  and  Ik 
for  k  =  0,  .  .  . ,  p  -  1.  A  tolerance  parameter,  ij  >  tj,  is  set  for  each  sweep  through  each 
group,  and  it  is  reduced  dynamically  on  each  iteration.  If  the  previous  iteration  found  an 
ij  such  that  Cj  <  0  for  some  j  in  the  klk  group,  then  the  next  group  searched  is  .4*+1. 
7*+i  (When  k  4-  1  ~  p,  then  the  next  group  searched  is  A0,  /<>).  If  no  j  is  found  in  this 
group  such  that  (c,/||*||)  <  -ij,  then  the  next  group  (F  +  2)  (with  provisions  for  recycling 
back  to  zero)  is  searched.  If  all  groups  are  searched  and  no  candidate  is  found,  tj  >  tj  is 
reduced,  unless  ij  is  already  as  small  as  tj,  in  which  case  the  current  solution  to  the  linear 
program  is  declared  optimal. 

The  partial  pricing  scheme  described  above  will  most  likely  reduce  the  amount  of 
pricing  out  and  hence  the  average  amount  of  computation  done  per  iteration,  but  may 
increase  the  number  of  iterations  necessary  to  find  the  optimal  solution,  because  the  “best 
candidate”  (in  terms  of  minimal  c;)  may  not  always  be  chosen  on  each  iteration  There 
is.  therefore,  a  tradeoff  that  the  user  of  a  linear  programming  system  must  make,  and  it 
is  not  clear  what  rule  one  should  use  to  choose  the  value  of  p  In  the  MINOS  manual,  it 
is  recommended  that  for  time-stage  models  with  t  time  periods,  the  user  should  choose 
p  -  t  The  example  below  shows  that  this  may  not  always  be  optimal 

The  problem  HITACHI  was  described  as  a  48  period  model  by  Nishiya  and  Kuriabashi 
(1984)  Brady  (1986)  wrote  a  program  and  used  it  to  generate  an  MI’S  deck  for  this  mode! 
MINOS  5.0  was  then  used  to  solve  this  problem  on  an  IBM  3081  using  different  partial 
price  options  The  size  of  the  problem  was  quite  large,  with  m  =  1008.  u  =  1632  and  174! 
nonzero  elements  in  .4  All  the  options  of  MINOS  were  set  to  their  default  mode  except 
the  PARTIAL  PRICE  option,  which  was  varied  from  p  =  1  to  p  =  4>  I  he  results  are  shown 
in  Table  3  1 

It  is  interesting  to  note  that  as  p  increased  from  1  to  8.  the  number  of  iterations  and 
the  total  CPU  time  decreased  For  p  >  8.  both  iterations  and  CPU  time  increased  with  p 
except  for  p  =  16  The  outlver  at  p  -  16  indicates  that  it  is  difficult  to  predu  t  the  result- 
of  varying  p  It  is  also  interesting  to  note  that  while  HITACHI  has  4*  time  periods.  /. 
was  not  the  best  choice 
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[«wfe|Mfrir| 

HEISTS 

Iterations 

l 

2 

791 

4 

781 

6 

”7ffl  ‘ 

8 

HJ 

18.77 

12 

WEsm 

19.17 

16 

799 

18.93  " 

24 

822 

19.45 

48 

879 

Table  3.1  Results  for  HITACHI  problem 


This  example  demonstrates  (and  this  is  true  for  many  others)  that  how  to  choose  p  to 
minimize  the  running  time  of  solving  a  linear  program  is  not  obvious.  Hence,  for  all  other 
results  in  this  study,  the  PARTIAL  PRICE  option  of  MINOS  was  not  used,  and  the  default 
choice  of  p  —  1  was  set  by  the  program.  In  later  sections,  there  will  be  discussion  on  how 
partial  pricing  could  possibly  be  used  to  reduce  the  running  time  of  different  algorithms 
implemented  herein. 

3.6  Random  versus  Real-World  Problems 

When  comparing  algorithms  for  linear  programming,  it  is  necessary  to  determine  whether 
randomized  or  real  world  problems  will  be  used.  l:p  to  now,  no  one  has  been  able  to 
create  a  random  model  that  generates  problems  that  are  representative  and  as  varied  as 
those  encountered  in  practice  A  number  of  characteristics  seem  to  be  inherent  in  real 
world  problems,  but  are  difficult  to  capture  and  model  stochastically  We  can,  however, 
comment  about  some  of  their  properties 

1  Sparsity  Most  real- world  problems  are  sparse  Typically,  they  are  sparse  because  if 
they  were  not.  they  never  would  be  formulated  it  would  require  too  much  effort  on 
the  part  of  the  model  formulator  to  collect  the  data  for  a  dense  matrix  For  example, 
specifying  100  percent  of  a  1000  by  2000  dense  matrix  would  require  the  input  of  2 
m  llion  nonzeros  Thus  sheer  data  collection  effort  on  the  part  of  the  user  seems  to 
be  the  main  reason  that  linear  programs  that  we  encounter  in  practice  are  sparse 
Another  reason  is  that  even  if  dense  models  were  formulated,  they  probably  could  not 
be  solved  bparsitv  is  usually  measured  bv  the  density  d  =  —  of  the  m.n  matrix  A. 

•  tfi  t, 

which  is  assumed  to  have  t  nonzero  entries 

2  Structure  and  near  triangularity  of  the  basis  In  problems  where  submat  rues  corre 
spond  to  interactions  between  two  different  set*  of  technologies  mam  of  the  te<  hnolo 
gies  are  independent  and  hem  e  the  <  orrespotiding  elements  are  zero,  yielding  large 
spars*-  bhnks  in  A  Indeed,  there  appear  to  lie  a  great  \<iriet\  of  structures  ennmn 
tered  m  the  sparsity  patterns  in  real  woild  problem'  1  hi'  i>  apparent  in  time  staged 
and  multi  staged  linear  programs,  whnh  are  often  des.  nhe.i  stain  a.se  in  strmture 

I  h**re  ma\  be  spe<  ial  st  r  u<  t  u  res  within  the  bio.  k'  of  a  si  i  p  as*-  si  pi,  t  ure  Net  works 
are  another  general  <  lass  I  lies*-  also  may  h a\e  spars;*\  patterns  within  the  network 
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which  are  specially  structured.  One  approach  to  solving  large-scale  linear  programs 
is  to  exploit  special  structures,  as  has  been  done  by  Fourer  (1982).  A  difficulty  with 
this  approach  is  that  there  appear  to  be  enough  slight  differences  in  structure  from 

firoblem  to  problem  that  no  characteristic  pattern  among  them  is  readily  apparent, 
n  the  course  of  solving  a  linear  program  by  the  simplex  method,  one  can  usually 
rearrange  the  rows  and  columns  of  the  various  bases  encountered  so  that  they  are 
nearly  triangular.  Near  triangularity  makes  it  a  relatively  inexpensive  operation  to 
represent  the  basis  as  a  product  of  lower  and  upper  triangular  matrices  and  preserves 
much  of  the  original  sparsity  from  iteration  to  iteration. 

3.  Degeneracy.  Random  models,  with  probability  one,  do  not  have  degenerate  vertices, 
while  most  real-world  problems  typically  have  many  degenerate  vertices  Recent  work 
by  Krueger  (1986)  has  analyzed  the  effects  of  degeneracy  in  random  polvhedra.  when 
such  degeneracy  can  be  included  among  the  randomly  generated  models  He  found 
that  as  n  increases,  the  limiting  behavior  of  the  average  performance  of  the  simplex 
method  does  not  change  when  the  number  of  degeneracies  and  m  are  fixed  Degeneracy 
occurs  in  practice  because  modellers  tend  to  overspecify  a  problem,  thereby  generating 
redundant  constraints.  Furthermore,  models  tend  to  have  constraints  that  convert 
one  variable  into  a  sum  of  other  variables,  this  can  cause  degenerac  y.  For  example, 
degeneracy  can  occur  in  practice  because  modellers  include  technologies  which  have  a 
number  of  activities  wholly  dependent  on  whether  a  particular  technology  is  used  or 
not.  These  can  have  constraints  linking  them  If  the  technology  is  not  used,  each  of 
these  constraints  will  generate  a  basic  variable  with  value  zero 
4  Unit  Elements.  Many  of  the  nonzero  elements  m  the  constraint  matrix  .4  of  most 
real  world  problems  have  values  of  +1  or  1  This  often  happens  because  many  of 
the  constraints  are  “bookkeeping"  constraints  that  a  modeller  puts  in  to  find  the  sums 
of  certain  variables  If  x,  represents  the  amount  invested  m  some  ac  tivity,  then  some 
constraint  will  add  jj  to  its  sum  while  another  constraint  will  subtract  i|  from  its 
sum.  The  column  will  then  have  two  non/eros.  1  l)  and  1  0.  winch  occur  in  the 
rows  corresponding  to  inputs  and  outputs  that  r(  affects  Such  relations  are  fairly 
common  in  models  and  are  used  to  summarize  the-  results  of  models  m  various  wavs 
Spreadsheets  are  very  popular  because  they  express  the  same  kind  of  relations  in  a 
convenient  format  of  two  way  tables  that  allow  easy  horizontal  ami  vertic  al  addition 
5  Feasibility  Most  models  for  randomly  generating  problems  are  deliberately  primal 
feasible  leg.  Kuhn  and  Quandt  il'M.Jii  If  a  problem  is  given  in  primal  feasible 
canonical  form,  then  the  Phase  II  simplex  algorithm  applies  while-  if  a  problem  is 
initially  given  in  dual  feasible  canonical  form  the*  duai  simplex  algorithm  applies 
Composite  simplex  algorithms  are  most  applicable  t<>  the  c  a*»  w  here  the-  problem  is 
both  primal  and  dual  infeasible  in  Ms  given  canonical  form 

With  these  c  onsiderat  ions  as  bac  kround  it  was  decided  to  do  ail  te»tmg  on  iJ  real 
world  problems  I  hese  problems  were-  •  c ■! !•-«  t »-• ;  from  various  viihh-.  t>\  the  'sv  stems  Op 
tumzat ions  Laboratory  in  t he  Depai t ment  of  Opeiat  ion*  |(f-M-,in  >.  at  vtant".,'t  I  mversitv. 
and  have  been  available  for  inaiiv  veai  •  i  ne  pioiciem*  tang*  m  ■ . .  •  :  i  <  • : . .  *ma,  t«  ■  medium, 
and  arc  believed  to  be  representative  .!  ..r.ear  pr<.g:am*  ’ : . .  i  *  at*  most  practi 

t  loners  All  the  problems  have  ,  c,,,i(  !..»*c*  geneiat*  d  M  l'>  <  >"  a-  p’.mal  and 

dual  infeasible  I  he  statistic  *  as  to  t  .  to;  •  1  <  'Ac.,  p: .  -  a'  •  'how  n  m 

I  able  1  2 
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Problem 

Size 

Name 

i saa 

■Kjj 

88 

mm igj 

ADUTTLE 

56 

97 

465 

8.56 

SHARE2B 

99 

79 

10.25 

■■ULM 

4.45 

BEACONFD 

174 

262 

3476 

7.62 

ISRAEL 

175 

142 

2358 

9.49 

*““235 

fgggBtfKlJ 

1.91 

E226 

226 

282 

4.77 

CAPRI 

272 

353 

1786 

1.86 

lz umamm 

STAIR 

357 

467 

3857 

2.31 

ETAMACRO 

401 

688 

2489 

0.90 

Table  3.2  Problem  Statistics 


3.7  Tatting  Methodology 

Except  for  the  earlier  reported  tests  on  partial  pricing,  all  computational  tests  in  this 
study  were  done  on  a  Digital  Equipment  Corporation  VaxStation  II  with  2  megabytes  of 
main  memory.  The  operating  system  was  MicroVMS  version  4.3,  and  the  VAX  Fortran 
Compiler,  version  4.1,  was  used  with  the  default  options  (which  includes  an  optimizer).  All 
tests  on  a  particular  problem  were  run  as  a  batch  job  to  eliminate  the  effect  of  differences 
in  time  of  terminal  input /output  on  the  results. 

The  algorithms  were  implemented  using  MINOS  version  5.0  as  a  framework,  using 
as  modules  for  the  various  algorithms  the  same  MINOS  subroutines  whenever  possible, 
in  order  to  minimize  any  distortion  of  results  due  to  special  routines  written  specially  for 
each  algorithm.  All  the  options  of  MINOS  were  set  at  their  defaults.  For  each  specific 
algorithm,  four  runs  were  made.  These  were  classified  as  to  whether  the  problem  was 
unsealed  or  scaled,  and  whether  the  initial  crash  basis  was  the  UB  crash  basis  or  the  SB 
crash  basis. 

A  special  routine  was  written  to  act  as  a  timer,  which  used  a  VMS  routine  that  returns 
the  CPI'  time  used  in  centiseconds.  Hence,  all  timing  results  are  accurate  to  at  most  a 
hundredth  of  a  second.  Mainly  timed  were  the  execution  of  the  entire  code,  including 
input  of  the  data  and  output  of  the  results,  and  the  execution  of  the  MINOS  subroutine 
M5SOI.V  This  subroutine  takes  as  input  a  linear  program  in  memory,  and  outputs  in 
memory  the  indices  of  the  optimal  basis.  All  algorithms  were  implemented  as  subroutines 
of  this  main  solving  routine.  The  CPC  time  of  this  central  routine  is  used  to  compare  the 
speeds  of  each  algorithm  Since  each  algorithm  was  a  variant  of  the  simplex  method,  the 
number  of  simplex  iterations  was  also  recorded 

Average  timing  statistics  were  also  collected  for  the  following  operations:  refactorizing 
the  basis,  ratio  tests,  pricing  out.  updating  the  factorization  of  the  basis,  solving  for  x. 
solving  for  the  representation  of  the  incoming  column,  and  updating  the  current  solution 
i  I  he  CPI  time  required  to  collect  these  statistics  was  negligible  compared  to  the  total 
exeiution  time  of  the  routine  M5SOI.V,  and  did  not  influence  the  results. 
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It  should  be  noted  that  a  re-run  of  the  same  algorithm  on  the  same  problem  can 
produce  slightly  different  CPU  times,  but  never  a  difference  in  iteration  counts.  This  can 
happen  because  after  every  iteration,  MINOS  prints  one  line  to  an  output  file  on  a  disk 
to  record  information  about  that  iteration.  The  time  to  perform  this  operation  can  vary 
slightly,  depending  upon  the  position  of  the  read/write  heads  of  the  disk  drive.  Because  of 
the  nature  of  the  VMS  operating  system,  it  is  impossible  to  recreate  exactly  the  conditions 
of  the  disk  drive  prior  to  each  run  of  the  solver.  I  assumed  that  this  source  of  error  was 
small  and  could  be  ignored  in  comparing  the  CPU  times  of  various  algorithms. 

For  each  algorithm,  four  tables  that  correspond  to  one  of  two  crash  bases  and  one  of 
two  scaling  options  will  be  presented  in  the  sections  that  follow.  The  vertical  ordering  of 
the  results  will  be  by  the  increasing  number  of  rows  in  each  problem.  Corresponding  to 
the  four  tables,  the  effects  of  scaling  for  the  algorithm  will  be  analyzed.  The  four  tables  for 
each  composite  algorithm  will  be  compared  to  the  four  tables  in  the  next  section.  Section 
8  contains  a  comparitive  study  of  the  entire  set  of  algorithms. 

3.8  Results  for  the  Primal  Simplex  Algorithm 

For  the  twelve  problems,  the  standard  two-phase  procedure  described  in  Section  2  was  run. 
The  results  for  this  algorithm  are  presented  in  Tables  3.4-3. 7.  Table  3.3  below  summarizes 
these  results  for  the  effects  of  scaling  and  the  crash  basis  on  the  two-phase  algorithm. 
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Table  3.3  Two-Phase  Algorithm — Comparison  of  Scaling  versus  Nonscaling 


For  any  pair  of  tables,  the  twelve  problems  were  compared  and  the  number  of  “wins”  and 
“losses”  were  counted.  The  entry  W/L  in  Table  3.3  corresponds  to  the  number  of  wins 
and  losses,  respectively.  A  win  W  is  scored  for  each  problem  when  the  combination  shown 
on  the  left  margin  of  the  table  had  less  CPU  time  and  less  iterations  than  the  one  shown 
on  the  top  margin  of  the  table.  A  loss  L  is  similarly  scored.  If  either  the  CPU  time  was 
equal  or  the  number  of  iterations  was  equal,  or  the  number  of  iterations  was  higher,  but 
the  CPU  time  was  less,  neither  a  win  or  a  loss  was  credited.  Hence,  W  +  L  ^  12  in  all  of 
the  cases.  There  are  three  possible  explanations  why  the  latter  inconsistency  occurred: 

1.  There  are  slight  inaccuracies  in  tne  timing  mechanism  (e.g.,  AFIRO); 

2.  The  number  of  iterations  is  close,  but  the  simplex  paths  taken  are  slightly  different, 
with  one  path  using  somewhat  sparser  LU  factorizations  of  the  basis  (e.g.,  BRANDT  ); 
and, 

3.  The  number  of  iterations  is  close,  but  more  time  is  spent  in  Phase  II,  where  the 
computation  of  tt  is  more  expensive,  since  cB  is  more  dense  in  Phase  II  than  in  Phase 
I  (e.g.,  ETAMACRO). 

It  seems  that  there  is  no  clear  cut  advantage  to  choosing  scaling  or  one  of  the  two  different 
starting  crash  bases.  There  is  a  slight  advantage  to  using  the  SCALE  option  of  MINOS 
(scaled  with  the  SB  basis)  as  opposed  to  not  using  that  option  (unsealed  with  the  UB 
basis). 
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Table  3.4  Two-Phase  Algorithm — Unsealed  with  UB  Basis 


Table  3.5  Two-Phase  Algorithm — Scaled  with  UB  Basis 
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Table  3.6  Two-Phase  Algorithm — Unsealed  with  SB  Basis 
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Section  4.  The  Weighted  Objective  Method 


In  this  section,  the  weighted  objective  method  is  discussed.  The  algorithm  is  similar  to 
the  classical  big-M  method  and  absolute  value  penalty  methods  of  nonlinear  programming. 
These  similarities  are  noted  and  the  implementation  of  the  algorithm  is  discussed  as  well. 

4.1  The  Weighted  Objective  Method:  Algorithm 

The  main  idea  behind  the  weighted  objective  method  is  to  create  a  new  linear  program 
combining  the  objectives  used  in  Phase  I  and  Phase  II  in  the  standard  procedure.  The 
Phase  I  objective  cFx  is  combined  with  the  Phase  II  objective  cTx  to  form  a  weighted 
combination  z  =  dTx  +  crc7!,  which  is  then  minimized.  If  the  new  objective  has  an  optimal 
solution  that  is  infeasible  for  the  original  problem,  then  o  is  decreased  by  a  factor  of  10 
and  the  method  is  continued.  If  a  is  reduced  by  a  factor  of  10  five  times  (a  MINOS 
default)  and  a  feasible  solution  has  not  been  reached,  then  a  is  set  to  0,  and  the  usual  two- 
phase  approach  is  resumed.  If  the  weighted  linear  program  remains  infeasible  after  a  105 
reduction,  it  is  most  likely  the  original  linear  program  is  infeasible  or  o  was  initially  chosen 
much  too  large.  The  usual  ‘most  negative  reduced  cost  rule’  is  used  to  choose  the  incoming 
variable.  The  algorithm  that  follows  looks  very  similar  to  the  two-phase  procedure,  with 
the  major  differences  being  the  way  c  is  defined  and  in  the  scheme  for  reducing  o. 

procedure  weighted-objective-, 
finished  «—  false; 
count  «—  5; 

while  not  finished  do 
begin 

if  needs -faciorization(B)  then  factorize(B ); 
if  infeasible  -basis  (B)  then  c  «—  d  -f  oc  else  c  *—  c; 


FindPI:  Solve  tttB  =  ce  for  7r; 


PriceOut:  Find  s  such  that  xt  has  minimum  reduced  cost  c,\ 
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begin 

infeasible  «—  false; 
if  count  >  1  then 
begin 

count  *—  count  —  1; 

<r  *-  <r/ 10  ; 

end 

else  <7  4—0; 
end; 

finished  «—  optimal  or  unbounded  or  infeasible ; 
end; 

MINOS  provides  a  WEIGHT  OH  LINEAR  OBJECTIVE  option  that  allows  the  user  to 
choose  the  initial  value  of  <7.  It  is  not  clear  what  heuristics  a  user  should  use  when  choos¬ 
ing  cr.  In  their  research  on  the  barrier  method,  Gill  et  al.  (1986)  found  that  a  choice  of 
cr  =  .l/||c||  was  promising  for  that  method.  The  same  value  was  used  for  the  computational 
tests  contained  herein. 

4.2  Th«  Big-M  Method 

The  big-Af  method  is  very  similar  to  the  weighted  objective  method.  In  its  simplest  form, 
a  linear  program  with  equality  constraints  is  given: 

minimize  cTx 

subject  to  Ax  =  6,  (4.1) 

x  >  0. 

By  multiplying  equation  »  by  — 1  if  <0,  it  transforms  this  linear  program  so  that  6,  >  0 

for  all  i.  A  set  of  artificial  variables  u  is  added  and  a  new  objective  is  formed: 

minimize  cTx  +  MeTu 

subject  to  Ax  ■+•  7u  =  b,  (4.2) 

x,  u  >  0, 

where  tr  —  (1, 1, . . . ,  1).  If  M  is  a  very  large  number,  then  minimization  of  this  form  will 
be  equivalent  to  the  textbook  form  of  Phase  I  of  the  simplex  method.  (A  slightly  different 
form  is  used  in  MINOS).  If  M  is  given  a  specific  value,  then  this  method  is  equivalent  to 
the  weighted  objective  method. 

The  origins  of  this  method  are  not  clear.  Hadley  (1962)  refers  to  the  method  as 
“Charnes’  — M  method.”  The  earliest  work  by  Charnes  using  the  notation  of  M  appears 
in  Charnes,  Cooper,  and  Henderson  (1953).  There,  Charnes  credits  Dantzig’s  first  paper 
on  the  simplex  method  (1951)  with  the  idea  that  became  the  big -M  method.  There, 
Dantzig  referred  to  assigning  “small”  weights  to  the  infeasibility  objective.  (Since  his 
linear  program  was  a  maximization  problem,  the  “small”  must  have  been  in  the  sense  of 
being  initially  very  negative.)  There  also  seems  to  be  an  independent  development  which 
used  the  notation  u;  for  M  as  used  above.  This  appears  in  the  works  of  Orden  (1951)  and 
Mayberry  (1951),  who  credit  Dantzig  with  the  idea.  Orden  worked  closely  with  Dantzig 
in  the  Pentagon  around  1950  and  the  idea  may  have  been  informally  suggested  to  Orden 
by  Dantzig.  In  the  implementation  of  the  method,  it  is  not  clear  whether  M  is  to  be  used 
symbolically,  or  whether  M  was  actually  a  preassigned  a  fixed  value.  The  big-M  method 
has  been  reported  to  have  been  quite  successful  when  a  model  with  minor  changes  is  solved 
repeatedly  and  the  value  of  M  (experimentally  arrived  at)  is  not  too  large. 
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4.3  Absolute  Value  Penalty  Methods 

Gill  et  al.  (1981)  describe  the  use  of  absolute  value  penalty  functions  when  applied  to 
nonlinear  programs  with  nonlinear  constraints.  When  applied  to  the  linear  program  (4.1), 
this  method  becomes  equivalent  to  the  weighted  objective  method. 

Given  the  nonlinear  program 

minimize  F(x ) 

subject  to  c,(x)  =  0,  i  =  l,...,t,  (4.3) 

the  absolute  value  penalty  function  is  defined  as 


PH(x,p)  =  F(x)  +  p^|c'j(x)|,  (4.4) 

i=i 

for  some  scalar  p  >  0.  If  one  lets 

Ci(x)  =  bi  -  of*  =  Ui,  (4.5) 

where  u  represents  the  artificial  variables  of  the  big -M  method,  then 

t 

Pa{x,p)  =  cTx  +p^u,.  (4.6) 

i=i 

This  is  the  same  function  used  by  the  big-M  method  with  p  —  M.  Hence,  the  absolute 
value  penalty  function,  when  applied  to  linear  programs,  yields  the  weighted  objective 
method. 

4.4  Implementation  and  Computational  Results 

The  weighted  objective  method  is  quite  simple  to  implement.  Since  the  objective  vector 
c  is  stored  as  part  of  the  matrix  A  in  MINOS,  a  pass  must  be  made  through  A  in  order 
to  compute  |jc||.  MINOS  provides  the  mechanism  for  reducing  o  if  the  current  basis  is 
optimal  for  the  weighted  objective  but  still  infeasible.  Hence,  there  is  a  small  amount  of 
extra  work  necessary  for  this  method  that  occurs  at  the  beginning  of  execution  and  during 
the  case  when  o  must  be  reduced.  It  should  also  be  noted  that  partial  pricing  is  available 
with  this  algorithm,  if  the  user  so  desires. 

The  computational  results  are  presented  in  Tables  4. 3-4. 6.  The  column  for  “Infea- 
sibility”  iterations  refers  to  iterations  while  the  solution  was  infeasible.  The  column  for 
“Post-Infeasibility”  iterations  refers  to  iterations  that  occurred  after  a  feasible  solution  was 
found,  which  were  iterations  identical  to  Phase  II  of  the  two-phase  algorithm.  It  is  inter¬ 
esting  to  note  that  for  the  problem  BRANDY  without  scaling,  the  first  feasible  solution 
found  for  both  starting  bases  was  the  optimal  solution.  This  phenomena  did  not  occur  for 
this  problem  when  the  problem  was  scaled. 

Table  4.5  summarizes  the  effects  of  scaling  for  the  weighted  objective  algorithm.  The 
meaning  of  each  entry  is  the  same  as  in  Section  3.  The  choice  of  starting  basis  or  scaling 
seems  to  have  little  effect  on  this  algorithm. 
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Table  4.6  compares  the  weighted  objective  algorithm  to  the  two-phase  algorithm.  For 
each  scaling/starting-basis  pair,  the  two  tables  were  compared  separately  for  CPU  time 
and  iterations.  The  W/L  entry  indicates  the  number  of  wins  and  losses  for  the  weighted 
objective  algorithm  versus  the  two-phase  algorithm.  In  comparing  the  results,  there  were 
many  instances  where  the  number  of  iterations  was  unchanged,  indicating  that  the  choice 
of  initial  weight  may  be  too  small.  From  this  table,  we  can  see  that  the  weighted  objective 
algorithm  tends  to  decrease  the  number  of  pivots  necessary  to  reach  the  optimal  solution, 
but  does  not  always  reduce  the  CPU  time.  This  is  because  in  the  case  where  the  number 
of  iterations  on  a  particular  problem  is  equal,  the  CPU  time  for  the  weighted  objective 
algorithm  will  be  higher  due  to  the  initial  computation  of  ||c||. 
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Table  4.1  Weighted  Objective  Algorithm — Unsealed  with  UB  Basis 


Table  4.2  Weighted  Objective  Algorithm — Scaled  with  UB  Basis 
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Table  4.3  Weighted  Objective  Algorithm — Unsealed  with  SB  Basis 
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Table  4.4  Weighted  Objective  Algorithm — Scaled  with  SB  Basis 
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Section  5.  The  Markowitz  Criterion 


Section  S.  The  Maritawki  Criterion 


The  Markowitz  criterion  it  an  alternative  rule  for  selecting  the  incoming  variable  in 
Phaae  I  of  the  simplex  method.  In  this  section,  the  algorithm  and  its  implementation  are 
discussed  and  computational  results  are  presented. 

S.l  The  Markowitz  Criterion:  Algorithm 

The  criterion  informally  suggested  to  Dantzig  by  Harry  Markowitz  and  discussed  in 
Dantzig's  book  (1963)  is  a  different  rule  for  choosing  the  incoming  variable  zt  in  Phase 
I.  After  a  feasible  point  is  found,  the  usual  Phase  11  procedure  is  resumed.  The  reduced 
cost  d}  represents  the  decrease  in  the  infeasibility  form  w  =  dTx  per  unit  change  in  xr 
Similarly,  the  reduced  cost  c}  represents  the  decrease  in  the  objective  z  -  eTx  per  unit 
change  in  x}.  The  idea  behind  the  Markowitz  criterion  is  to  choose  x,  in  such  a  way  that 
there  is  a  maximum  decrease  of  the  objective  form  z  per  unit  decrease  of  the  infeasibility 
form  u>.  Mathematically,  s  is  chosen  so  that 


c. 

min  — *- 
i,  <o  - a t 


(5.1.1) 


The  algorithm  is  guaranteed  to  converge,  since  dt  <  0,  and  u>  will  decrease  at  every 
iteration,  assuming  some  lexicographic  device  for  resolving  degeneracy.  The  procedure  for 
this  algorithm  in  terms  of  the  simplex  method  is  as  follows: 

procedure  Id  arkovnts -Criterion, 
finished  —  false; 
while  not  finished  do 

factonze(B). 


begin 

if  needs -factoniaUon(B)  then 
FindPI:  Solve  xTB  =  cm  for  n, 


if 


infeasible  Jbasis(B)  then  begin 


FindPI  Solve  rTB  =  d,  for  ir; 

PriceOut  Find  s  such  that  i. 

has  mm  ratio  for  d,  <  0. 

optimal  —  d,  >0; 

end 

else  begin  _ _ _ 

PriceOut:  Find  s  such  that  z,  has  minimum  reduced  cost  r,. 


optimal  «—  c,  >  0: 

end 

if  not  optimal  then _ 

Represent  Solve  Bp  =  .4  /  ,  for  p. 


RatioTest  Find  r  such  that  in  blocks  the  change  m  i,  ! 
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if  not  unbounded  then 

begin 

q  —  Br. _ 

Update  Update  value  of  r; 

pivot(B,  s,  q,  r ), 

end, 

end; 

infeasible  *—  infeasible  Jxuu(  B)  and  optimal: 
finished  «—  optimal  or  unbounded  or  infeasible. 
end. 

There  are  some  interesting  properties  of  this  algorithm.  If  c,  >  0,  then  a  change  in  any 
nonbasic  variable  that  decreases  w  will  increase  z  At  such  a  point,  the  current  solution  to 
the  linear  program  is  most  likely  superoptimal.  If  c,  <  0,  then  changing  i,  will  decrease 
simultaneously  both  the  infeasibility  form  w  and  the  objective  form  z.  This  would  appear 
to  be  the  most  desired  choice  of  s  However,  it  appears  that  any  such  s  are  found  for  only 
the  first  few  iterations  of  the  method.  Thereafter,  c,  >  0.  This  observation  leads  one  to 
believe  that  the  Markowitz  criterion  has  the  feature  that  the  primal  and  dual  feasibility 
conditions  are  at  war  with  each  other  In  the  next  section,  the  computational  results  will 
verify  this  statement 

I  he  Markowitz  criterion  relates  in  an  interesting  way  to  the  weighted  objective 
method  The  weighted  objective  is 


z  =  dTi  +  ocTi 


(5.1.21 


1  his  objective  ran  be  multiplied  by  \  / a  so  that  the  weight  is  now  on  the  infeasibilitv  form. 
I  he  reduced  costs  for  this  new  objective  are  equal  to 


d  ■+  r 


(5.1.3) 


F«>r  any  nonbasic  i}  with  d}  <  0.  if 


(5.1.4) 


(5.1.5) 


M\  *  ornpanng  equation'.  1 5  1  It  and  (5.1  1).  one  can  see  that  the  Markowitz  criterion  is 
dvnamuallv  balancing  the  weight  on  the  infeasibilitv  form  so  that  r;  =  0  It  chooses 
the  minimal  weight  so  that  this  condition  is  true 
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S.2  Impttnwwtilion  and  Computational  Results 

I  he  Markowit2  criterion  requires  some  extra  work  per  iteration  when  the  basis  is  primal 
infeasible  In  Phase  I  of  the  two-phase  method.  only  the  prices  and  reduced  costs  for  the 
mfeasibihty  form  d1 1  are  conmuted  For  the  Markowitz  criterion,  the  prices  and  reduced 
costs  for  the  objective  form  c‘i  may  need  to  be  computed  as  well  I  here  are  two  possible 
methods  of  implementation  (where  tj  is  the  optimality  tolerance  as  defined  in  Section  1  i 
1  In  the  main  loop  that  computes  the  reduced  costs  d;  for  each  nonbasic  z;.  c;  is 
computed  as  well  The  values  of  the  elements  of  the  sparse  matrix  .4  are  each  retrieved 
once  from  memory 

For  each  nonbasic  ir  r.  is  computed  unit  when  dt  ■  tj  r  Heine  is  not  always 
computed  However,  when  r.  i>  <  c»mputecj,  similar  pricing  h»op  is  used  as  when 
computing  d,  Therefore,  each  element  of  the  sparse  matrix  .4  i*  retrieved  twice  from 
memory 

In  cut  her  case,  an  extra  call  to  the  procedure  FindPI  must  be  made  m  « •  r •  1  *•  r  to  compute  *■ 
It  would  seem  that  the  second  of  the  two  option'  is  more-  effu  lent  f  xpermientai  evidence 
indicates  that  the*  second  option  was  better  in  over  'hi  percent  of  the  computational  test' 
Hence,  the  results  reported  below  use  that  option 

in  order  to  use  equivalent  tolerance  criteria  onix  those  j  for  whi<b  •  t4  r  are 
used  when  evaluating  th*‘ expression  t  r>  i  !•  If  all  if,  then  the  current  solution 

is  considered  optimal  tor  the  1’hase  I  li n«*ar  program  ari'l  the  usual  Phase-  II  procedure  i' 
begun 

I  he-  computational  results  for  t fie*  Markowitz  ■  iiteric.n  are  presented  ,n  1  atiies  u  1  u  4 
I  he-  c  olurnn  for  “Infeasibilit y  iterations'  refers  to  tj,e  number  of  iterations  performed  when 
t  tic-  basis  was  primal  infeasible*  1  he  column  for  'Post  Intcasiboit  x  iterations  rc*fer'  to  the 
tiumbet  of  iterations  performecj  after  a  feasible  solution  wa*  found  I’  '  it.ter.-st  mg  to  note 
that  for  many  of  die  problems,  the-  first  feasible  soluticu.  found  ne  the  optimal  solution 
independent  of  the  starting  crash  basis  oi  of  scaling  I  hi'  mein  ate*'  that  the  Markowitz 
c  riterion  usually  finds  a  basis  that  is  supetupt unai  and  maintains  superoptimality  until  the 
optimal  basis  is  found  Most  of  the  problem*  require  few  Phase  II  iteration'  Hence,  the 
initial  feasible  point  found  by  the  Markowitz  .ntc-ri.cn  n  close  to  du  optimal  solution  in 
terms  of  Hamming  distance 

I  able  r>  5  summarizes  t  he  etfec  t»  of  s.  aling  for  tin  Mar  kow  it  z  .  riteimi.  I  !.«•  meaning 
of  each  entry  is  t he  same  as  in  Sec  t  ion  i  I  he  c  tnuc  e  of  st  rtr’  itig  Im*i'  or  '.  ai mg  seem'  to 

have  rio  overall  effect  on  thi'  algorithm  P  n  interesting  t<*  note  tha’  'fi»  SB  f  >«i>  Is  H  *i  s 

better  than  the  LTB  basis  w  hen  sc  aling  w<\-  u*ed  bu»  tf.a'  •  !.*•  resii.t  -  were  reversed  w  here 
sc  aling  was  n c > t  used 

1  able  ri  f»  compares  the  Markowitz  enter,  ..n  ♦ » »  t;,»  tx\..*  pfias*  a.g  '  lai.  for  ea<  fi 
sc  aimg /si  art  ing  basis  pair,  the  two  tables  were  •  ompate<:  opaia'c,,  <  Pf  tune  and 
iterations  Ifie  U  L  entry  unit*  ates  the  number  ■  »f  wit,-  and  .  r  tn»  Markowitz 

criterion  versus  the  two  phase  algorithm  lion,  ■  t . ;  -  I  ab.e  w.  al.  >•••  »  !.a’  *  !,*•?»  -eel!.' 

Ii»*  no  advantage  to  using  the  Markow  itz  ■  r  it  er ..  >;,  «*x.  ep*  *  ■:  a  :.-w 
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Table  5.3  Markowitz  Criterion — Unsealed  with  SB  Basis 
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Table  5  4  Markowitz  Criterion  Scaled  with  SB  Basis 
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Table  5.5  Markowitz  Criterion — Comparisons  of  Scaling  versus  Nonscaling 
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Table  5.6  Markowitz  Criterion — Comparison  to  Two-Phase  Algorithm 
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Section  6.  The  Self-Dual  Parametric  Algorithm 


In  this  section,  Dantzig’s  self-dual  parametric  algorithm  is  discussed.  This  algorithm 
has  been  shown  (Lustig,  1987)  to  be  equivalent  to  Lemke’s  algorithm  for  solving  linear 
complementarity  problems  when  applied  to  linear  programs  as  a  special  case.  The  imple¬ 
mentation  of  the  self-dual  algorithm  is  described  and  computational  results  are  presented. 

6.1  Dantzig’s  Self-Dual  Parametric  Algorithm 

Dantzig  (1963)  describes  the  self-dual  parametric  algorithm  as  a  method  which  does  not 
require  an  initial  primal  or  dual  feasible  solution,  but  which  uses  the  concepts  of  the 
primal  and  dual  simplex  algorithms  extensively.  Both  the  weighted  objective  method  and 
the  Markowitz  criterion  are  alternative  rules  for  choosing  the  incoming  variable  s  in  Phase 
I  of  the  simplex  method.  After  3  is  chosen,  the  outgoing  variable  r  is  chosen  by  the  normal 
ratio  test.  In  contrast,  the  self-dual  algorithm  changes  the  order  of  choosing  the  incoming 
variable  s  and  the  outgoing  variable  r  on  each  iteration  depending  on  the  current  basic 
solution  to  the  problem.  Ravindran  (1970)  gave  an  equivalent  statement  of  the  algorithm 
by  modifying  Lemke’s  method  to  modify  only  the  primal  simplex  tableau. 

It  is  easier  to  understand  the  method  using  some  of  his  notation.  The  discussion  of 
the  algorithm  is  simpler  when  applied  to  the  linear  program 


minimize  c  x 

subject  to  Ax  >  b,  (6.1.1) 

x  >  0. 

A  parameter  9  is  used  in  this  method  and  the  linear  program  (6.1.1)  is  transformed  to 

(6.1.2) 

where  u  6  is  a  vector  of  slack  variables  which  form  an  initial  basis, 


minimize  (c  -f  0d)Tx 
subject  to  —  Ax  +  Iu  =  —b  +  8f, 

x,  u  >  0 


and 


J  1,  if  bi  >  0; 
U  “  \  0,  if  6.  <  0  ’ 


,  _  /  1,  if  Cj  <  0; 

;  ~  1  0,  if  Cj  >  0 


for  i  =  1, . . .  ,m, 


for  j  =  1, . . .  ,n. 


(6.1.3) 


(6.1.4) 


If  6  —  -l-oo,  then  the  basis  uj,  u2,  . . . ,  um  is  both  primal  and  dual  feasible  for  the  parametric 
linear  program  (6.1.2).  There  exists  a  value 


0o  = 


(6.1.5) 


r*4  cl#(  i1. 
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and  an  t  >  0  such  that  for  8  =  0q  —  e,  the  current  basic  solution  to  the  parametric  linear 
program  (6.1.2)  is  infeasible  for  only  one  primal  variable  or  one  dual  variable  (assuming 
no  ties  exist  when  equation  (6.1.5)  is  evaluated — this  occurs  when  degeneracy  exists  in 
the  parametric  linear  program).  If  8  =  8o,  then  the  parametric  linear  program  has  either 
multiple  primal  or  multiple  dual  solutions.  If  a  reduced  cost  Cj  +  8o dj  for  a  nonbasic  Xj 
was  zero,  one  could  set  s  =  j  and  attempt  to  make  xt  basic,  using  the  primal  simplex  ratio 
test 

.  f  - b ,  -f  ft80  ^  n)  ,c  , 

r  =  argmin  <  - ,  —  atJ  >  0  >  (6.1.6) 

»=l,...,m  (  — ^ij 


to  determine  r.  If  a  primal  basic  variable  u,  =  —  b,  +  Bofi  was  zero,  then  one  could  set 
r  =  i  and  use  the  dual  simplex  ratio  test 


f  c,  +  d,0o  \ 

=  arg  min  <  — - - — ,  —  arj  <  0  > 

3-  1 *>  l  arj  ) 


(6.1.7) 


to  determine  s.  After  exchanging  x,  and  ur  in  the  basis,  the  new  basic  solution  is  optimal 
for  9  =  8o-  Both  /  and  d  must  be  updated  to  values  /  and  d  in  the  same  way  as  —6  and  c 
are  updated  to  values  b  and  c.  Dantzig  (1963)  and  Ravindran  (1970)  give  different  proofs 
showing  that  8  can  now  be  reduced  to  a  value  below  0O  while  maintaining  primal  and  dual 
feasibility.  The  system  is  now  in  the  same  form  as  the  parametric  linear  program  (6.1.2) 
bv  a  relabeling  of  the  variables.  The  process  can  be  continued  until  8  —  0.  At  this  point,  a 
primal  and  dual  feasible  solution  to  the  original  linear  program  (6.1.1)  has  been  obtained. 
If  the  ratio  test  performed  in  (6.1.6)  has  no  — a„  >  0  or  the  ratio  test  performed  in  (6.1.7) 
has  no  -ar]  <  0,  then  the  original  linear  program  (6.1.1)  is  primal  or  dual  infeasible  (or 
both). 

In  the  framework  of  the  revised  simplex  method,  the  algorithm  is  as  follows: 

procedure  Sclf-Dual{f,d)\ 

finished  *—  false; 

8  < - hoc; 

while  not  finished  do  begin 
optimal  *—  false; 

if  needs  jactorization(B)  then  factorize(B): 

RatioTest:  Find  s  or  r  such  that  xBr  or  c ,  blocks  the  change  in  8 
if  blocks(cs)  then  begin 

Represent:  Solve  Bp  =  [A  /].,  for  p; 

RatioTest:  Find  r  such  that  xBr  +  8fr  blocks  the  change  in  xs\ 

unbounded  ^infeasible  *—  r  =  0; 

if  not  unbounded  .infeasible  then  begin 

FindPI:  Solve  -)TP  =  er  for  -y; 

PriceOut:  Compute  ■>  =  .4r.  —  7 [.4  /]; 


end 

end 

else  if  blockslx, 


then  begin 


s  \  \  V 


vy 
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FindPI:  Solve  7 TB  =  er  for  7; 

PriceOut:  Compute  7  =  Ar.  —  7 [A  J]; 

RatioTest:  Find  s  such  that  c,  -1-  8d ,  blocks  the  change  in  7rr; 

unbounded  Jnfeosible  «—  s  =  0; 
if  not  unbounded  .infeasible  then 


Represent:  Solve  Bp  —  [A  I\-,  for  p; 

end 

finished  <—  optimal  or  unbounded -infeasible ; 

if  not  finished  then  begin 

q  *-  Br; _ 

Update:  Update  values  of  x,c,d,/; 

pivot(B,s,q,r); 

end; 

end; 

There  are  some  things  to  note  about  this  form  of  the  algorithm.  The  variable 
unbounded  Jnfeosible  indicates  that  the  original  linear  program  has  either  an  unbounded 
objective  function  or  no  feasible  solution.  If  this  variable  is  set  to  true,  then  one  can  use 
the  normal  two-phase  simplex  method  to  determine  the  infeasibility  or  unboundedness  of 
the  problem.  The  vector  7  is  needed  to  update  c  and  d.  In  the  normal  revised  simplex 
method,  c  is  recomputed  each  iteration  by  pricing  out.  In  the  revised  simplex  form  of  the 
self-dual  method,  it  is  more  efficient  to  compute  7  by  a  pricing  operation  and  then  update 
c  and  d  using  the  formulas 


c  *—  c  —  617  and  d  *— d  —  6? 7,  (6.1.8) 

where 

Sx  =  —  and  62  =  — •  (6-1.9) 

7*  7* 

This  method  eliminates  one  full  call  to  the  routine  PriceOut. 

It  should  be  noted  that  the  routine  name  Self-Dual(f,d)  has  /  and  d  as  parameters. 
This  is  because  the  initial  choice  of  /  and  d  must  satisfy  — 6  +  0/  >  0  and  c  +  8d  >  0  for 
some  8  >  0.  Hence  the  initial  values  of  /  and  d  defined  in  equations  (6.1.3)  and  (6.1.4)  are 
specific  instances  of  the  parameters  /  and  d,  respectively.  In  Section  7,  a  variant  of  the 
self-dual  method  that  uses  a  different  choice  for  the  initial  values  of  /  and  d  is  discussed. 
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6.2  Implementation  and  Computational  Results 

As  compared  to  the  two-phase  simplex  method,  the  self-dual  algorithm  must  perform 
some  extra  computational  work  per  iteration.  This  extra  work  can  be  separated  into  two 
categories — extra  ratio  tests  and  extra  work  in  the  procedure  Update.  In  the  two-phase 
method,  there  is  one  ratio  test  for  the  m  primal  basic  variables  on  every  iteration.  In  the 
self-dual  algorithm,  a  ratio  test  is  used  to  compute  the  new  value  of  9 ,  by  testing  the  m 
primal  basic  variables  and  the  n  reduced  costs  for  the  nonbasic  variables.  If  a  dual  variable 
blocks  the  decrease  of  9 ,  the  primal  simplex  algorithm  is  used  and  another  ratio  test  for 
the  m  primal  variables  is  used.  If  a  primal  variable  blocks  the  decrease  of  0,  then  the  dual 
simplex  algorithm  is  used  and  another  ratio  test  for  the  n  reduced  costs  is  used.  Therefore, 
on  each  iteration,  either  2m  4-  n  or  m  -f  2n  variables  are  tested  in  various  ratio  tests. 

On  each  iteration  in  the  two-phase  simplex  method,  only  the  values  of  the  m  primal 
basic  variables  must  be  updated.  In  the  self-dual  method,  the  values  of  c,  d  and  /  must 
be  updated  as  well.  Hence  2n  +  m  extra  variables  must  have  their  values  updated  besides 
the  m  components  of  x. 

For  theoretical  convergence  of  the  self-dual  method,  9  should  decrease  on  each  itera¬ 
tion.  However,  due  to  degeneracies  in  the  parametric  linear  program  (6.1.2),  9  may  remain 
unchanged.  Because  of  round-off  errors,  9  may  actually  increase  by  an  extremely  small 
amount.  Hence,  on  each  iteration,  a  tentative  value  of  9  =  9  is  computed  via  the  ratio 
test.  Then  9  is  computed  by  taking  the  minimum  of  the  previous  value  of  9  and  9,  i.e., 


min(0, 9). 


(6.2.1) 


Other  round-off  errors  may  cause  the  values  x  +  9f  and  c  +  6d  to  violate  their  feasibility 
constraints  by  more  than  the  primal  and  dual  feasibility  tolerances,  respectively.  When 
MINOS  checks  to  see  if  the  current  solution  is  too  infeasible  (in  accordance  with  the  CHECK 
FREQUENCY  option),  the  current  parametric  solution  is  checked  as  well.  If  a  component 
of  the  parametric  solution  violates  its  respective  feasibility  tolerance,  then  the  respective 
component  of  /  or  d  is  reset  so  that  the  current  parametric  solution  is  feasible.  This  allows 
the  algorithm  to  continue  from  the  current  solution  without  increasing  the  value  of  9. 

In  the  above  discussion  of  the  self-dual  method,  it  was  assumed  that  the  linear  program 
had  every  variable  bounded  below  bv  zero  and  that  all  of  the  linear  constraints  were  of  the 
same  type.  The  method  was  started  from  the  basic  variable  set  consisting  of  the  m  slack 
variables.  In  the  context  of  MINOS,  the  self-dual  method  is  started  from  a  crash  basis  B. 
and  each  variable  may  have  a  finite  lower  or  upper  bound,  or  both.  This  affects  the  sign 
of  the  nonzeros  of  /.  Given  this  initial  basis,  the  nonbasic  variables  are  represented  by  set 
A  such  that  .V  =  ,4A-.  Each  of  the  nonbasic  variables  xv  is  set  equal  to  one  of  its  bounds. 
I  lie  equations  for  the  linear  program  can  be  rewritten  as 


Big  4-  A’x.v  =  0, 
l  <  x  <  u. 


(6.2.2a) 

(6.2.26) 


Since  B  is  nonsingular.  r(  can  be  computed  by  solving  equation  (6.2.2a).  Now  /,  is  chosen 
according  to  whether  xPi  violates  its  upper  or  lower  bound,  i.e. 


1  if  ij  <  I,,  j  =  £*,; 

-  1  if  Sj  >  u j.  j  =  L\ 
0  otherwise. 


(6.2.3) 
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Then  lB  <  xm  +  8f  <  uB  for  all  sufficiently  large  6  if  each  of  the  variables  in  the  initial 
crash  basis  has  exactly  one  finite  bound. 

If  a  variable  has  finite  lower  and  upper  bounds,  then  the  parametric  value  xB  +  6f 
will  satisfy  one  bound  for  large  values  of  6,  but  not  the  other.  It  turns  out  that  the  most 
difficult  computational  issue  in  the  self-dual  method  is  the  manipulation  of  primal  basic 
variables  that  have  finite  lower  and  upper  bounds.  Suppose  x\  is  such  a  variable  in  the 
initial  crash  basis  and  that  B\  =  1.  If  x\  <  li  for  the  initial  basic  solution,  then  the 
parameterized  value  x\  +  6f\  >  for  large  6.  After  a  pivot  in  the  self-dual  method,  the 
value  of  xj  may  exceed  l\  in  such  a  way  that  Xi  +  9}\  >  h  for  all  6  >  0.  Furthermore, 
x\  +  9f\  >  ui  for  the  current  value  of  6.  In  order  for  the  self-dual  method  to  converge,  it 
is  necessary  to  treat  such  conditions  specially. 

In  the  theoretical  version  of  the  self-dual  method,  each  component  of  x  has  a  lower 
bound  of  0  and  no  upper  bound.  If  a  variable  x\  has  an  upper  bound  ui ,  then  the  constraint 

Xi  +  s  =  t*i  (6.2.4) 

is  added  to  the  problem,  where  s  is  the  slack  variable  for  the  upper  bound  constraint.  In 
the  context  of  MINOS,  it  is  necessary  to  compute  the  value  of  s  implicitly.  If  such  a  row 
(6.2.4)  were  added  to  the  MINOS  constraint  set,  then  the  variable  s  would  be  in  the  initial 
basis.  Furthermore,  the  initial  basic  solution  would  satisfy 


Similarly,  the  initial  value  of  /  satisfies 

(.  .*■ . :)(:)-(/.) 

for  some  /  (where  f\  is  the  initial  value  of  /  corresponding  to 
value  of  /  corresponding  to  s  on  later  iterations  and  xi  remains 

(,  .*■ .  :)(/)-«)■ 

Hence,  the  value  of  f,  can  be  computed  as 

=  (6-2.8) 


(6.2.6) 

ij).  If  /,  represents  the 
basic,  then 

(6.2.7) 


Therefore,  the  parametric  value  of  s  is 


s(0)  =  «i  -  xi  +  0(/i  -  /i). 


(6.2.9) 


In  a  primal  simplex  ratio  test,  either  X]  +f\B  is  tested  or  s(0)  is,  depending  upon  whether  Tj 
is  increasing  or  decreasing  as  a  function  of  the  incoming  basic  variable.  This  computation  is 
simple,  as  long  as  the  initial  values  of  /  are  saved  throughout  the  execution  of  the  method. 
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This  argument  also  applies  to  the  first  ratio  test  done  on  each  iteration  to  find  the 
variable  that  blocks  the  decrease  of  6.  If  s(0)  blocks  the  decrease  of  9,  then  ij  exits  the 
basis  and  becomes  nonbasic  at  its  upper  bound. 

It  should  be  noted  that  this  problem  does  not  occur  with  the  parametric  reduced 
costs,  since  the  dual  variables  of  the  linear  program  in  MINOS  standard  form  have  at  least 
one  infinite  bound  (ignoring  the  rare  case  of  nonbasic  free  variables). 

The  computational  results  for  the  self-dual  method  are  presented  in  Tables  6. 2. 1-6. 2. 4. 
The  columns  for  primal  and  dual  iterations  correspond  to  the  number  of  primal  simplex 
and  dual  simplex  iterations,  respectively,  done  by  the  self-dual  algorithm. 

Table  6.2.5  compares  the  effects  of  scaling  and  the  different  crash  bases  on  the  self-dual 
algorithm.  The  meaning  of  each  entry  is  the  same  as  in  Section  3.  Table  6.2.6  compares 
the  self-dual  algorithm  to  the  two-phase  algorithm  for  iterations  and  CPU  time.  The  entry 
U*  —  L  indicates  the  number  of  times  that  the  self-dual  algorithm  had  less  iterations  (CPU 
time)  than  the  two-phase  algorithm.  The  self-dual  algorithm  seems  to  reduce  the  number 
of  iterations  on  some  problems,  but  the  extra  work  per  iteration  increases  the  CPU  time 
enough  so  that  the  self-dual  algorithm  is  faster  than  the  two-phase  algorithm  for  only  a 
feu-  problems. 
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Table  6.2.1  Self-Dual  Algorithm — Unsealed  with  UB  Basis 


Problem  Name 
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(Seconds) 

2 

6 

■■jKM 

ADLITTLE 

33 

39 

72 

8.57 

SHARE2B 

67 

79 

146 

fic]K 

185 

313 

BEACONFD 

31 

17 

48 

ISRAEL 

177 

237 

MmSm 

HTWi 

wmmjjm 

■KWM 

161.91 

E226 

■ 

205 

500 

166.47 

CAPRI 

mmm 

159 

204 

57.81 

BANDM 

■KM 

■mm 

STAIR 

j: 

ETAMACRO 

mBKSSm 

Table  6.2.2  Self-Dual  Algorithm — Scaled  with  UB  Basis 


Section  6.2 


Implementation  and  Computational  Results 


Table  6.2.3  Self-Dual  Algorithm — Unsealed  with  SB  Basis 
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Table  6.2.4  Self-Dual  Algorithm — Scaled  with  SB  Basis 
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Table  6.2.5  Self- Dual  Algorithm — Comparison  of  Scaling  versus  Nonscaling 
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Table  6.2.6  Self-Dual  Algorithm— Comparison  to  Two-Phase  Algorithm 
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in  this  section,  two  variants  of  the  self-dual  parametric  algorithm  are  discussed. 
Both  variants  are  motivated  by  considering  the  equivalence  of  the  self-dual  algorithm  and 
Lemke’s  method,  shown  by  Lustig  (1987).  The  first  method  is  equivalent  to  choosing  a 
different  covering  vector  for  Lemke's  algorithm.  The  second  method  tries  to  account  for 
the  prevalence  of  fixed  variables  in  the  initial  crash  basis. 

7.1  Normalized  Covering  Vectors 

When  Lemke's  algorithm  is  applied  to  the  linear  program 


minimize  c  x 


subject  to 


Ax  >  6, 

i  >  0, 


an  extra  variable  9  is  adjoined  to  form  the  set  of  equations 


uTy  -I-  vTi  =  0, 
u,  v,  x,  y  >  0. 


(7.1.  i; 


(7.1.2a) 

(7.1.26) 

(7.1.2c) 


For  the  self  dual  algorithm  to  converge,  the  initial  values  of  /  and  d  must  be  chosen  so 
that  the  parametric  primal  basic  variables  and  parametric  reduced  costs  are  nonnegative. 
Typically,  /  and  d  are  chosen  so  that 


(  /  1,  if  6,  >  0;  , 

'  ~  (  0,  if  4,  <0  '  for 


i  =  1, .  . .  ,m, 


(7.1.3) 


f  1,  if  Cj  <  0; 
~  \  0,  if  Cj  >  0  ' 


for  >  = 


(7.1.4) 


In  a  study  of  the  general  linear  complementarity  problem.  Eaves  (1971)  suggested  varying 
the  choice  of  the  initial  values  of  /  and  d  when  using  Lemke’s  algorithm.  The  same  sugges¬ 
tion,  of  course,  is  applicable  to  the  self-dual  algorithm.  Krueger  (1986)  also  investigated 
the  effects  of  varying  the  initial  values  of  /  and  d.  For  the  self-dual  algorithm,  /  and  d 
can  be  chosen  so  that  the  various  ratio  tests  remain  unaffected  if  the  rows  or  columns  of 
the  original  linear  program  are  rescaled.  If  the  original  problem  is  not  scaled  first,  this  will 
only  be  in  the  sense  that  the  initial  value  of  9  is  unit-free ,  i.e.,  a  change  in  units  of  any 
primal  or  dual  variable  will  not  change  the  initial  value  of  6  with  respect  to  this  covering 
vector.  If  the  original  problem  has  been  scaled  first  and  then  /  and  d  chosen  so  that  9  is 
to  be  initially  scale-free,  it  will  be  so  for  all  subsequent  changes  in  9. 

Because  an  initial  crash  basis  can  contain  both  primal  decision  variables  j;,  j  = 
1 . n.  and  slack  variables  u„  t  =  1 . rn.  it  is  important  to  treat  each  of  them  separately 
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when  developing  a  unit-free  covering  vector.  Furthermore,  the  value  d}  for  the  reduced 
cost  Cj  will  depend  upon  whether  its  complementary  variable  was  a  decision  variable  or  a 
slack  variable. 

Let  A  =  (— A  J],  cT=  (eT0)  €  &m+B,  and  relabel  the  m  slack  variables  u,  to  be  xm+, 
so  that  the  linear  program  is  now 


minimize  cTx 
subject  to  Ax  =  —6, 
x  >  0. 


(7.1.5) 


Let  the  initial  crash  basis  be  represented  by  an  index  set  B.  Suppose  h  decision  variables 
Xjk  ,  1  <  4  <  h,  1  <  jk  <  n,  and  g  slack  variables  Xjk ,  h  -f-  1  <  it  <  m,  n  +  1  <  jk  <  n  -f  m, 
(m  =  g  -f  h)  are  the  initial  basic  variables  and  Bt  =  jk  so  that  the  first  h  basic  variables 

are  decision  variables.  Let  M  index  the  n  nonbasic  variables.  B  =  AB  is  nonsingular,  and 
therefore  the  linear  program  (7.1.5)  can  be  rewritten 


minimize 
subject  to 


(ct—ctbB  lA)x 

IxB  +  B~lAj,/Xtr  =  —B~1b, 
x  >  0. 


(7.1.6) 


Writing  x  =  xv,  cT  =  (cTv  —  ctbB  ^v),  A  =  —B  lAAr,  and  b  =  B~*b,  this  linear 
program  can  be  rewritten  as 

minimize  cTx 

subject  to  Ax  >  b,  (7.1.7) 

x  >  0. 

Lemke’s  method  is  initialized  for  (7.1.7)  by  writing 


(fM0*+(-V  o)(I)  =  (o)- 

(7.1.8a) 

uTy  +  vTx  =  0, 
u,v,x,y  >  0, 

(7.1.86) 

(7.1.8c) 

where  y  =  (y  u)B, 
(7.1.8)  is 


B  = 


u  =  xB,  and  v  =  (y  v)^.  The  initial  ratio  test  in  Lemke’s  method  for 


(7.1.9) 


If  a  row  or  column  of  the  original  linear  program  (7.1.1)  is  multiplied  by  a  scalar  cr,  then 
the  ratio  test  (7.1.9)  will  be  unit- free  if  /  and  d  are  chosen  so  that  the  blocking  variable 
in  (7.1.9)  is  independent  of  the  scaling.  Hence,  /  and  d  will  depend  on  the  original  values 
of  A,  b ,  and  c. 
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First,  consider  when  a  column  of  the  original  problem  (7.1.1)  is  scaled  by  a  value  a. 
Let  X\  be  a  variable  in  the  initial  crash  basis.  Then  the  initial  value  of  x\  =  —  b\.  Suppose 
that  —6]  <  0  so  that  xj  is  initially  infeasible.  If  the  column 


( 


Cl 

A- 1 


) 


is  multiplied  by  a  value  <r,  so  that 


(7.1.10) 


(7.1.11) 


then  the  value  of  xi  in  the  scaled  problem  is  x'j  =  —b\/o.  We  desire  that  the  ratio 


fi 


be  unaffected  by  this  scaling.  If 


(7.1.12) 


/■  =  iik  at  nr. 


(7. 1.13) 


then 


i 


(7.1.14) 


Hence,  the  ratio  (7.1.12)  is  not  affected  by  any  scaling  of  the  column  for  xj. 

Suppose  xj  is  initially  nonbasic  at  its  lower  bound,  and  the  initial  value  of  the  dual 
variable  Oj  =  Cj  <  0.  If  a  scaling  as  in  (7.1.11)  is  done,  then  the  initial  value  of  t>i  is 
u'j  =  oc\.  In  the  initial  ratio  test  for  Lemke’s  method,  the  ratio 


£i 

di 


is  computed.  If 


then  the  ratio  (7.1.15)  is 


dx 


<*.-11(4  AT  HI, 

<7C\  _  C\ 

<*\\  (  Cl  A?J)||  H  (c,  )  ]| ' 


(7.1.15) 

(7.1.16) 

(7.1.17) 


Hence,  the  ratio  (7.1.15)  is  unaffected  by  any  scaling  of  the  column  for  xj. 

When  a  row  of  the  original  problem  is  scaled,  the  values  of  the  slack  variables  and 
their  reduced  costs  are  affected.  Suppose  the  first  row  (  /Ij.  bi  )  is  scaled  by  a  value  a  so 
that 


(A',.  &i)  =  <rM,.  6,). 


(7.1.18) 
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Suppose  U]  =  kj  —  A\.x  is  in  the  initial  crash  basis  and  that  uj  =  x*+i  <  0.  If  the  above 
scaling  is  done,  then  u',  =  au\.  We  desire  that  the  ratio  test 


is  independent  of  any  scaling  on  the  first  row.  If 


A+i  =  IIMi.  *S)H, 


then  the  ratio  (7.1.19)  is 


/*+ 1  *11  Mi-  Mil  II  Mi-  M 


(7.1.19) 


(7.1.20) 


(7.1.21) 


Hence,  this  choice  of  /*+ 1  causes  the  ratio  (7.1.19)  to  be  unit-free. 

Now  suppose  that  uj  is  initially  nonbasic  at  a  value  of  =0  and  that  its  correspond¬ 
ing  dual  variable  y\  is  negative,  j/j  =  ck+ j  =  —caB~le j,  where  ex  is  the  first  column  of 
the  identity  matrix.  When  the  first  row  of  A  is  scaled  as  in  equation  (7.1.18),  the  first 
column  of  B~l  is  divided  by  a.  Hence,  a  scaling  causes  y\  =  yi/er.  We  desire  that  the 
ratio 

(7.1.22) 

d  i 


be  independent  of  scaling.  If  we  set 


dr  =  \\(A\.  Mil'1, 


(7.1.23) 


then  the  ratio  (7.1.22)  is 


j*»+l 


d>  ill  Ml-  b]  )  II"1  IK*!-  Mil"1 


(7.1.24; 


which  is  seen  to  be  independent  of  any  scaling  of  the  first  row. 

In  summary,  the  value  of  /,  will  depend  on  whether  the  i,h  basic  variable  is  a  decision 
variable  x;,  1  <  ;  <  n,  or  a  slack  variable  x},  n  +  1  <  j  <  m  +  n.  This  distinction  is 
determined  by  the  value  of  B,  =  j.  Hence,  if  xB<  <  0, 


f  =  /  II  ( 

’  l  II  (  Ak- 


AT})\\-1  if>=B„  l<;<n 
6*  )  ||  if  k  =  Bt  —  n,  1  <  k  <  m 


(7.1.25) 


(/,  =  0  if  is  feasible).  The  value  of  d}  is  determined  for  c}  <  0  by 


(C}  AT})\\  ifl  <J<n 

(Ak.  6t)||-‘  \(k  =  j-n,  1  <k< 


(7.1.26) 
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(dj  =  0  if  c}  >  0).  These  values  can  be  computed  prior  to  the  first  iteration  of  the  self-dual 
algorithm. 

The  computational  results  for  the  self-dual  method  with  a  normalized  covering  vector 
are  presented  in  Tables  7. 1.1-7. 1.4.  The  columns  for  primal  and  dual  iterations  correspond 
to  the  number  of  primal  simplex  and  dual  simplex  iterations,  respectively,  done  by  the  self¬ 
dual  algorithm  using  that  covering  vector. 

Table  7.1.5  summarizes  the  effects  of  scaling  for  the  self-dual  algorithm  with  a  nor¬ 
malized  covering  vector.  It  seems  that  the  choice  of  scaling  or  the  starting  basis  has  little 
effect  on  this  variant. 

Table  7.1.6  compares  the  self-dual  algorithm  with  a  normalized  covering  vector  (NCV) 
to  the  two-phase  algorithm.  It  seems  that  this  variant  offers  an  improvement  over  the  two- 
phase  procedure  for  only  a  few  of  the  problems.  The  CPU  time  improvements  occurred 
only  when  the  problem  was  unsealed  and  started  from  the  UB  basis. 

Table  7.1.7  compares  the  effects  of  using  the  normalized  covering  vector  versus  using 
the  default  vector  in  the  self-dual  algorithm.  The  results  are  mixed,  and  there  does  not 
seem  to  be  a  clear  advantage  to  using  either  covering  vector.  It  is  interesting  to  note 
the  results  for  when  the  problems  were  scaled  and  started  from  the  SB  basis.  Here,  the 
iteration  counts  were  higher  when  using  the  normalized  covering  vector,  but  the  CPU  time 
was  less  in  three  of  the  cases.  This  is  because  for  those  three  problems,  less  dual  simplex 
iterations  were  done  when  the  normalized  covering  vector  was  used,  while  the  total  number 
of  iterations  increased.  This  indicates  the  sensitivity  of  the  self-dual  algorithms  (and  its 
variants)  to  fluctuations  in  the  number  of  primal  and  dual  simplex  iterations  and  their 
effects  on  the  total  CPU  time. 
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Table  7.1.1  Self-Dual  Algorithm  Normalized  Variant — Unsealed  with  UB  Basis 


Table  7.1.2  Self-Dual  Algorithm  Normalized  Variant — Scaled  with  UB  Basis 
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Problem  Name 

Primal 

Dual 

Total 

CPU  Time 

Iterations 

Iterations 

Iterations 

(Seconds) 

AFIRO 

4 

2 

6 

0.65 

ADLITTLE 

51 

66 

117 

13.88 

SHARE2B 

53 

82 

135 

19.23 

Shareib 

ll2 

172 

284 

~  $4.16 

BEACONED 

33 

6 

39 

10.05 

ISRAEL 

209 

122 

331 

82.38 

BRANtoV  "1 

114 

2l5 

329 

98.42 

E226 

324 

195 

519 

171.33 

CAPRI 

57 

195 

252 

71.79 

BANDM 

262 

255” 

517 

26l.80 

STAIR 

307 

217 

524 

298.66 

ETAMACRO 

437 

228 

665 

289.38 

Table  7.1.3  Self-Dual  Algorithm  Normalized  Variant — Unsealed  with  SB  Basis 


Problem  Name 

Primal 

Dual 

Total 

CPU  Time 

Iterations 

Iterations 

Iterations 

(Seconds) 

AFIRO 

4 

2 

6 

—  0.62 

ADLITTLE 

58 

54 

112 

13.65 

SHARE2B 

70 

101 

171 

24.56 

Shareib 

h— '  132 

154 

286 

”  55.40 

BEACONED 

36 

43 

79 

20.24 

ISRAEL 

175 

100 

275 

66.91 

BRANDY 

120 

“ “  226 

345“ 

99.10 

E226 

280 

176 

456 

145.00 

Chi>R\ 

45 

181 

226 

63.37 

BANDM 

278 

261 

539 

203.64 

STAIR 

184 

186 

370 

220.98 

ETAMACRO 

444 

356 

800 

352.97 

Table  7.1.4  Self-Dual  Algorithm  Normalized  Variant — Scaled  with  SB  Basis 
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atm irmmm 

- Ob - 

SB 

Scaling  Status 

Unsealed 

Scaled 

Unsealed 

Scaled  | 

UB 

4W-^L 

mmm 

iri'i  rnrri 

tarsia 

SB 

Unsealed 

¥A\'£\n 

6W-5L 

wmmim 

mw'mm 

Table  7.1.5  Self-Dual  Algorithm  Normalized  Variant — 
Comparison  of  Scaling  versus  Nonscaling 


1  Starting  Basis 

UB 

—  §B  — j 

Scaling  S 

tatus 

Unsealed 

Scaled 

Unsealed 

Scaled 

Self-Dual  (NCV) 

CPU 

3W-9L 

ow-i2l 

0W-12L 

0W-12L 

versus  Two-Phase 

Iterations 

3W-gL 

"  2w-$l 

4W-7L 

5W-6L 

Table  7.1.6  Self-Dual  Algorithm  Normalized  Variant — 
Comparison  to  Two-Phase  Algorithm 


Starting  Basis 

UB 

SB  | 

Scaling  Status 

Unsealed 

Scaled 

Unsealed 

Scaled 

Self-Dual  (NCV)  1  CpO  ~ 

5W-7L  1 

6w-6l  ! 

6W-5L 

7W-4L 

versus  Self-Dual  |  Iterations 

5W-6L 

4W-7L 

6W-5L 

4W-7L 

Table  7.1.7  Self-Dual  Algorithm  Normalized  Variant — 
Comparison  to  Self-Dual  Algorithm 


7.2  Fixed  Variables 

Some  general  linear  programs  have  constraints  of  the  form 

ajr  =  b,.  (7.2.1) 

Wheii  such  a  linear  program  is  converted  to  MINOS  standard  form,  the  slack  variable  xn+, 
corresponding  to  this  constraint  will  have  equal  lower  and  upper  bounds.  Any  variable  x} 
with  equal  lower  and  upper  bounds  is  known  as  a  fixed  variable.  In  the  dual  of  the  linear 
program,  the  complement  of  a  fixed  variable  is  free ;  in  other  words,  the  complementary 
variable  has  infinite  lower  and  upper  bounds.  Hence,  once  a  fixed  variable  becomes  nonbasic 
in  the  course  of  executing  any  variant  of  the  simplex  method,  that  variable  will  never 
become  basic  at  a  later  iteration. 

Fixed  variables  can  appear  in  the  initial  crash  basis,  because  they  must  be  included 
in  order  to  maintain  the  lower  triangularity  of  the  crash  basis.  If  the  fixed  variable  is  not 
at  its  bound,  then  Phase  I  will  consider  the  fixed  variable  to  be  infeasible  and  include  the 
variable  in  the  Phase  I  objective.  Eventually,  this  will  cause  the  fixed  variable  to  attain 
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its  bound.  At  that  point,  the  variable  will  exit  the  basis  if  the  equation  (7.2.1)  is  not 
redundant  given  the  other  linear  equations  in  the  linear  program.  There  seems  to  be  an 
intuitive  advantage  to  having  as  few  fixed  variables  in  the  set  of  basic  variables  as  possible. 
It  is  interesting  to  note  that  the  two-phase  simplex  method  causes  a  fixed  variable  to  exit 
the  basis  by  choosing  incoming  columns  before  choosing  exiting  columns  in  each  iteration. 

The  self-dual  algorithm  has  the  property  that  it  can  choose  exiting  columns  before 
incoming  columns.  The  idea  behind  this  variant  of  the  self-dual  algorithm  is  to  cause  as 
many  fixed  variables  as  possible  to  exit  the  basis  before  proceeding  with  the  usual  self-dual 
method.  Because  of  the  freedom  allowed  in  the  initial  choice  of  /  and  d,  /  can  be  chosen 
so  that  some  fixed  variable  “wins”  the  initial  ratio  test  in  the  self-dual  algorithm.  The 
dual-simplex  method  is  invoked  and  that  fixed  variable  exits  the  set  of  basic  variables, 
never  to  reenter.  The  method  can  then  be  restarted  by  choosing  a  new  /  and  d  so  that 
another  fixed  variable  will  now  exit  the  basis.  If  all  basic  fixed  variables  are  at  their  bounds 
or  are  nonbasic,  then  the  usual  self-dual  algorithm  is  invoked.  The  algorithm  will  converge 
since  either  a  fixed  variable  exits  the  basis  on  an  iteration  or  6  is  reduced. 

The  initial  values  of  /  and  d  are  chosen  in  the  default  way  and  the  new  value  of  8  is 
computed.  The  fixed  variable  that  is  furthest  from  its  bound  is  then  chosen  as  the  variable 
to  become  nonbasic.  For  example,  suppose  x }  is  the  ith  basic  variable  and  is  designated 
as  the  most  infeasible  fixed  variable,  and  that  the  current  value  of  x}  >  l}.  Then  /,  is 
modified  so  that 


(7.2.2) 


which  would  force  x}  to  be  the  exiting  variable  if  the  initial  ratio  test  in  the  self-dual 
algorithm  were  recomputed.  After  x}  exits  the  basis,  /  and  d  are  reset  to  their  initial 
values  depending  upon  the  new  values  of  the  basic  variables  xB  and  reduced  costs  c. 

The  fixed-variable  variant  of  the  self-dual  algorithm  can  be  used  with  the  default  initial 
choices  of  /  and  d  (equations  (7.1.3)  and  (7.1.4))  or  with  the  normalized  covering  vector 
considered  in  the  previous  section.  The  computational  results  with  the  default  covering 
vector  are  shown  in  Tables  7. 2. 1-7. 2. 4,  while  the  results  with  the  normalized  covering 
vector  are  shown  in  Tables  7. 2. 5-7. 2. 8. 

Tables  7.2.9-7.2.10  summarize  the  effects  of  scaling  on  the  fixed-variable  variant  of  the 
self-dual  algorithm  with  both  covering  vectors.  The  results  are  varied  enough  to  suggest 
no  trends  relative  to  this  algorithm  for  scaling  or  the  choice  of  basis. 

Tables  7.2.1 1-7.2.12  compare  the  fixed- variable  (FV)  variant  of  the  self-dual  algorithm 
to  the  two-phase  algorithm  for  the  default  covering  vector  and  the  normalized  covering 
vector  (NCV).  For  both  covering  vectors,  the  number  of  iterations  decreased  for  some  of 
the  problems  (especially  when  the  problem  was  unsealed  and  using  the  UB  basis),  but 
improvements  in  CPU  time  occurred  for  only  a  few  problems. 

Fables  7.2.13-7.2.14  compares  the  fixed- variable  (FV)  variant  of  the  self-dual  algo¬ 
rithm  to  the  regular  self-dual  algorithm  for  each  of  the  covering  vectors.  This  allows  one 
to  see  if  eliminating  the  fixed  variables  first  yields  any  improvements  for  the  self-dual  al¬ 
gorithm.  There  seems  to  be  no  clear  advantage  or  clear  disadvantage  to  using  this  variant 
of  the  self-dual  algorithm. 
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Ki!lUiI33 

■HHttttl 

H55H1 

Iterations 

Iterations 

n?rn— 

4 

1 

6 

ADUTTLE 

23 

79 

12.39 

SHARE2B 

44 

63 

15.85 

mmmmm 

BEACONED 

■H 

10 

37 

9.46 

ISRAEL 

82 

322 

82.35 

ITT7I M 

■^9 

jttSfK  $jHj 

E226 

286 

148 

t&H 

Wt  vJkJL^H 

CAPRI 

48 

176 

224 

STAIR 

246 

146 

ETAMACRO 

407 

254 

661 

■si 

Table  7.2.1  Self-Dual  Algorithm  Fixed  Variable  Variant — Unsealed  with  UB  Basis 


Problem  Name 

uzsm 

HESSHI 

Total 

Mgaggg 

Iterations 

AF1RO 

4 

it 

6 

HHHf 

ADUTTLE 

28 

42 

8.25 

SHARE2B 

67 

79 

146 

SHARE1B 

IWHKjM 

■n|n  n 

BEACONED 

r  •*'  i  >,/[ 

Ha 

^k;| 

ISRAEL 

HBBi 

60 

HHVuNt  W 

BRANDY 

— Et« 

516 

161.35 

E226 

140 

133.92 

CAPRI 

^■1 

252 

291 

83.85 

177 

|OKU|| 

STAIR 

149 

141 

fmmm 

189.93 

ETAMACRO 

561 

364 

925 

412.55 

Table  7.2.2  Self-Dual  Algorithm  Fixed  Variable  Variant — Scaled  with  UB  Basis 
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Table  7.2.4  Self-Dual  Algorithm  Fixed  Variable  Variant — Scaled  with  SB  Basis 
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Table  7.2.5  Self-Dual  Algorithm  Fixed  Variable  Normalized — Unsealed  with  UB  Basis 


Problem  Nam 


KfaRH] 


ADLITTLE 

SHARE2B 


BR 
E226 
CAPRI 


B 

STAIR 

ETAMACRO 


Table  7.2.6  Self-Dual  Algorithm  Fixed  Variable  Normalized  -Scaled  with  UB  Basis 
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Table  7.2.9  Self-Dual  Algorithm  Fixed  Variable  Variant — 
Comparison  of  Scaling  versus  Nonscaling 


f-fFinrrm- 

■IH 


UB  I  SB 


tatuslUnscaled  Scaled  Unsealed  Scaled 


nscaied  7W-4L  4W-6L 


caled  4W-7L  3W-7L  I  5W-6L 


Unsealed  6W-4L  7W-5L 


caled  2W-8L  6W-5L  4W-6L 


Table  7.2.10  Self-Dual  Algorithm  Fixed  Variable  Normalized- 
Comparison  of  Scaling  versus  Nonscaling 


max 


tarting  Basis 


caling  Status 


al  (Fixed  Vars) 


ase 


i 


12 


UB 


nscaied  Scaled 


■lu 

Mu 


nscaledl  Scaled 


W-12L  2W-10L 


L  4W-7L 


Table  7.2.11  Self-Dual  Algorithm  Fixed  Variable  Variant 
Comparison  to  Two- Phase  Algorithm 


tarting  Basis 


tatus 


UB 

MIIIKIHU 

Scaled 

4W-8L 

1W-11L 

6W-4L 

JW-8L 

W-9L 


5W-6L 


I  able  7  2.12  Self  Dual  Algorithm  lived  Variable  N  <>rma!i/ed 
Comparison  to  I  wo  Phase  Algorithm 
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1W-11L 
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Starting  Basis 

SB  | 

Scaling  Status 

[trrmm 

mmmM 

firrnnm 

Scaled  | 
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Section  8.  Summary  and  Comparisons  of  Results 

In  this  section,  the  computational  results  are  used  to  compare  all  the  algorithms 
against  each  other.  Although  one  would  expect  that  one  of  the  many  possible  composite 
algorithms  would  turn  out  to  be  a  clear  winner,  the  evidence  suggests  that  no  algorithm  is 
best  for  all  problems.  The  two-phase  and  self-dual  algorithms  were  used  for  a  sensitivity 
analysis,  and  the  evidence  indicates  that  the  choice  of  the  best  algorithm  is  not  clear  even 
when  the  problems  have  the  same  sparsity  structure.  In  the  last  section,  some  suggestions 
for  further  research  are  offered. 

8.1  Comparing  Many  Algorithms 

In  the  course  of  comparing  two  different  simplex  variants  on  a  specific  linear  program,  two 
computer  runs  are  made.  It  is  important  to  be  conscious  of  any  differences  between  the  two 
runs  that  may  affect  the  interpretation  of  the  results  of  these  runs.  The  major  differences 
between  two  such  runs  will  be  in  the  choice  of  scaling  and  the  choice  of  the  initial  basis.  In 
the  earlier  sections,  the  effects  of  these  choices  for  each  algorithm  were  compared,  and  it 
seemed  that  choosing  the  SCALE  option  of  MINOS  or  either  starting  crash  basis  had  little 
effect  on  the  results.  In  this  type  of  comparison,  the  algorithm  chosen  was  a  fixed  control 
and  the  choice  of  scaling  and  the  starting  bases  were  varied.  Tables  were  also  presented 
that  compared  each  composite  algorithm  to  the  two-phase  algorithm.  Here,  the  choices  of 
scaling  and  the  starting  basis  were  fixed  when  two  algorithms  were  compared.  From  these 
pairwise  comparisons,  one  can  measure  how  often  one  algorithm  was  better  than  another. 
However,  they  offer  no  measure  as  to  how  much  better  that  algorithm  was. 

A  scoring  system  can  be  used  to  measure  the  relative  speeds  of  the  different  composite 
algorithms  to  the  two-phase  algorithm.  We  are  interested  in  comparing  the  relative  de¬ 
crease  or  increase  in  the  iteration  counts  and  CPU  time,  which  we  will  call  a  performance 
measure.  Let  qa  be  the  value  of  a  performance  measure  for  Algorithm  a  (a  =  0,1)  on 
some  problem.  Suppose  that  Algorithm  0  is  the  two-phase  algorithm,  whose  performance 
measures  will  be  used  as  the  standard  for  comparing  all  of  the  other  algorithms.  Then  the 
performance  factor  pi  for  Algorithm  i  is  defined  as 

P,  =  100(  — ).  (8.1.1) 

9o 

Hence,  if  the  performance  measure  being  used  was  CPU  time,  then  a  value  p\  =  120 
would  mean  that  Algorithm  1  was  20  percent  slower  than  the  two-phase  algorithm,  while, 
similarly,  a  value  pi  =  70  would  mean  that  Algorithm  1  was  30  percent  faster  than  the 
two-phase  algorithm.  It  should  be  noted  that  the  performance  factors  for  the  two- phase 
algorithm  will  always  be  100. 

Performance  factors  can  be  computed  by  comparing  the  runs  of  two  algorithms  on 
the  same  problem  using  the  same  scaling  option  and  the  same  starting  basis.  Hence,  for 
each  starting  basis,  scaling  option,  and  algorithm,  12  performance  factors  are  computed  (1 
for  each  problem).  The  minimum,  maximum,  and  average  values  of  these  12  performance 
factors  are  then  computed.  In  the  tables  that  follow,  the  performance  factors  are  rounded 
to  the  nearest  integer.  Performance  factors  are  computed  for  iteration  counts  and  CPI 
time. 

Table  8.1.1  presents  the  performance  factors  for  iterations  counts.  Each  algorithm  has 
at  least  one  problem  for  which  it  gives  a  24  percent  improvement  in  the  number  of  iterations 
as  compared  to  the  two-phase  algorithm.  The  average  number  of  iterations  was  in  most 
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Table  8.1.2  Performance  Factors  for  CPU  Times 


forms  better  than  any  other  algorithm.  For  each  algorithm,  there  is  at  least  one  problem 
for  which  that  algorithm  is  best.  Given  a  problem,  there  seems  to  be  no  rule  that  allows 
one  to  choose  an  algorithm  which  will  perform  best  on  that  problem. 
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8.2  Sensitivity  Analysis 

Users  of  commercial  linear  programming  systems  frequently  solve  a  particular  linear  pro¬ 
gram  and  then  solve  some  perturbation  of  that  problem.  This  perturbation  usually  is  of 
the  same  structure  as  the  original  problem,  and  may  only  differ  in  the  elements  of  A ,  b, 
and  c.  It  is  often  useful  to  solve  such  a  perturbed  problem  by  using  an  optimal  basis  of 
the  original  problem  as  the  starting  basis  for  the  new  problem.  Usually,  only  a  few  pivots 
of  the  simplex  method  must  be  done  to  reach  the  optimal  basis  of  the  new  problem. 

A  perturbation  as  described  above  can  cause  the  original  optimal  basis  to  become 
either  primal  infeasible,  dual  infeasible,  or  both.  In  the  case  of  primal  infeasibility,  the 
dual  simplex  method  is  usually  used.  In  the  case  of  dual  infeasibility,  Phase  II  of  the 
two-phase  algorithm  is  used.  When  both  the  primal  and  dual  problems  are  infeasible,  the 
two-phase  algorithm  is  usually  used.  In  such  a  case,  however,  the  self-dual  algorithm  may 
perform  well,  since  only  a  few  iterations  may  be  needed. 

In  order  to  analyze  the  application  of  the  self-dual  algorithm  for  sensitivity  analysis,  it 
is  necessary  to  determine  a  “real-world”  perturbation  of  a  “real-world”  problem.  A  random 
perturbation  of  a  problem  would  most  likely  destroy  any  degeneracies  that  were  present. 
Fortunately,  the  literature  contains  a  good  example  of  a  “real-world”  perturbation. 

Manne  (1973a)  described  the  model  DINAMICO  in  enough  detail  to  allow  the  asso¬ 
ciated  linear  program  to  be  defined  in  a  modeling  language  such  as  GAMS  (Bisschop  and 
Meeraus,  1982).  GAMS  can  be  used  to  create  an  input  file  for  MINOS  to  use  to  solve  the 
problem.  This  procedure  was  followed,  and  the  optimal  basis  for  the  problem  was  saved. 
The  problem  has  319  rows,  424  columns,  and  4157  nonzeroes.  Manne  (1973b)  (pg.  152) 
later  described  different  alternatives  to  the  initial  problem  that  can  be  solved.  All  the 
alternatives  described  leave  the  sparsity  structure  of  the  matrix  A  unchanged,  but  change 
only  either  the  primal  or  dual  infeasibility  of  the  original  problem,  not  both  simultaneously. 

Alternative  cases  1  and  2  from  Manne’s  paper  vary  the  growth  rate  g  used  in  the 
model.  Varying  this  rate  changes  the  primal  infeasibility  of  the  optimal  basis.  Alternative 
case  5  from  Manne's  paper  makes  the  optima!  basis  dual  infeasible.  Both  of  these  variations 
were  made  simultaneously,  and  the  growth  rate  g  was  varied  from  a  value  of  4  percent  to 
12  percent.  The  case  g  =  .07  corresponds  to  the  optimal  basis.  Note  that  when  g  =  .06, 
the  optimal  basis  for  the  case  g  =  .07  was  the  same,  and  hence  the  only  computations  that 
MINOS  performed  were  to  determine  that  the  initial  basis  was  optimal  without  performing 
any  pivots. 
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Table  8.2.1  shows  the  results  for  the  two-phase  algorithm  and  the  self-dual  algorithm 
with  the  default  covering  vector.  Note  that  the  self-dual  algorithm  had  competitive  itera¬ 
tion  counts  for  most  of  tne  cases. 


0  Two-Phase  ■  Self-Dual 


Growth  Rate  g 


Figure  8.2.2  CPU  Times  for  Sensitivity  Analysis 


It  is  interesting  to  compare  the  CPU  time  results  using  a  graph  as  in  Figure  8.2.2. 
The  better  CPU  times  are  indicated  by  smaller  bars.  The  self-dual  algorithm  outperforms 
the  two-phase  algorithm  when  g  =  .05  and  g  =  .08.  It  is  interesting  to  note  the  difference 
in  performance  for  g  =  .10  as  compared  to  g  —  .09,  .11,  or  .12.  It  seems  that  the  self-dual 
algorithm  is  better  for  small  perturbations;  conversely,  the  two-phase  algorithm  is  better 
for  large  perturbations.  However,  it  should  be  noted  that  the  amount  of  CPU  time  did 
not  increase  monotonically  as  g  increased.  This  suggests  that  it  is  difficult  to  predict  the 
performance  of  either  algorithm  for  arbitrary  values  of  g. 

All  these  problems  were  run  from  the  same  crash  basis.  The  problems  have  the 
same  structure  for  A,  but  approximately  120  of  the  nonzeros  differ  in  value.  These  results 
suggest  again  that  no  one  algorithm  will  perform  consistently  better  than  another  on  every 
problem,  even  when  the  problems  have  the  same  structure,  i.e.,  problems  which  have  the 
same  pattern  of  nonzero  data  in  A ,  but  different  values  for  some  of  those  nonzeroes. 
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8.3  Summary  and  Further  Research 

The  computational  results  contained  in  this  study  exhibit  the  sensitivity  of  the  simplex 
method  to  different  schemes  for  choosing  the  incoming  and  outgoing  variables.  Intuitively, 
one  would  expect  that  a  composite  algorithm  would  perform  better  than  the  two-phase 
algorithm,  but  only  the  weighted  objective  algorithm  was  consistently  competitive.  The 
two-phase  algorithm  and  weighted  objective  method  have  a  good  average  performance. 
Other  algorithms  seem  to  perform  better  than  the  two-phase  algorithm  on  a  few  problems, 
indicating  that  no  rule  exists  that  allows  one  to  choose  the  best  method  for  a  particular 
problem.  The  drawback  of  most  of  these  alternative  methods  is  that  they  require  more 
work  per  iteration  than  the  two-phase  algorithm  to  choose  s  and  r.  They  can  only  be 
faster  than  the  two-phase  algorithm  if  the  number  of  iterations  is  significantly  lowered  (30 
percent  or  more).  Probably  the  best  way  to  choose  a  variant  of  the  simplex  method  is  to 
use  the  two-phase  algorithm  first.  If  it  performs  poorly,  then  trying  another  variant  of  the 
simplex  method  may  be  in  order.  If  n  is  much  larger  than  m,  the  two-phase  algorithm, 
weighted  objective  method,  and  the  Markowitz  criterion  can  each  be  used  with  partial 
pricing. 

Some  open  questions  that  remain  are; 

1.  What  happens  to  a  class  of  problems  that  are  related  in  structure,  but  grow  in  size? 

2.  Do  there  exist  other  variants  of  the  simplex  method  that  will  consistently  do  better 
than  the  two-phase  algorithm? 

3.  Are  there  variants  that  have  advantages  over  others  in  degenerate  situations? 

4.  Can  the  methods  of  artificial  intelligence  be  used  to  analyze  a  problem  in  advance, 
and  then  dynamically  vary  the  choice  of  variant  of  the  simplex  method  to  use  as  the 
computation  progresses? 

5.  Can  one  develop  a  matrix  generator  for  creating  problems  that  are  in  some  sense  ran¬ 
dom  and  yet  representative  of  the  variety  of  structures  found  in  real-world  problems? 
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For  almost  forty  years,  the  slaplex  method  has  been  the  method  for 
choice  for  solving  linear  programs.  The  method  consists  of  first  finding  a 
feasible  solution  to  the  problem  (Phase  I),  followed  by  finding  the  optimum 
(Phase  II).  Many  algorithms  have  been  proposed  which  try  to  combine  the 
processes  embedded  in  the  two-phase  process.  This  study  will  compare  the 
merits  of  some  of  these  composite  algorithms. 

Theoretical  and  computational  aspects  of  trie  Weighted  Objective, 
Self-Dual  Parametric,  and  Markowitz  Criteria  algorithms  are  presented. 
Different  variants  of  the  Self-Dual  methods  are  discussed. 

A  large  amount  of  computational  experience  for  each  algorithm  is 
presented.  These  results  are  used  to  compare  the  algorithms  in  various 
ways.  The  implementatons  of  each  algorithm  are  also  discussed.  One  theme 
that  is  present  throughout  all  of  the  computational  experience  is  that  there 
is  no  one  algorithm  which  is  the  best  algorithm  for  all  problems. 
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